Skip to main content
2 of 4
added 330 characters in body

Delete multi-line strings

There have been several questions here regarding replacing multi-line strings using the Unix shell, but I haven't found one that will work in this situation.

I am trying to remove keys and constraints from some MySQL DDL, which looks like this (one example):

CREATE TABLE `access_group` (
  `GROUP_ID` int(10) NOT NULL AUTO_INCREMENT,
  `PARENT_GROUP_ID` int(10) DEFAULT NULL,
  `GROUP_NAME` varchar(45) NOT NULL,
  `GROUP_DESC` varchar(45) NOT NULL DEFAULT '',
  PRIMARY KEY (`GROUP_ID`),
  KEY `testkey` (`PARENT_GROUP_ID`)
) ENGINE=InnoDB AUTO_INCREMENT=66 DEFAULT CHARSET=latin1;

I want to remove everything from the comma ending the line before 'PRIMARY KEY' up to, but not including ') ENGINE=' (there can be zero or multiple lines between these, and they won't always start with KEY or have the parenthesis, but the ') ENGINE=' is consistent). The result should look like this:

CREATE TABLE `access_group` (
  `GROUP_ID` int(10) NOT NULL AUTO_INCREMENT,
  `PARENT_GROUP_ID` int(10) DEFAULT NULL,
  `GROUP_NAME` varchar(45) NOT NULL,
  `GROUP_DESC` varchar(45) NOT NULL DEFAULT ''
) ENGINE=InnoDB AUTO_INCREMENT=66 DEFAULT CHARSET=latin1;

I'm open to using any standard command-line utility (e.g. sed, perl, awk), but since these files can be fairly large (some are on the order of tens or hundreds of GB) they need to be efficient.