Friday, June 18, 2010

More RegEx Goodness

It's hard to believe that I have spent hours copying and pasting and manually reformatting text files when I had the power of regular expressions at my fingertips.  Oh, how true is the maximum "knowledge is power"

I had a need to create a text file with database names and key fields in a CSV format.  What I had was a text file with the database schema like this (I highlighted the parts of the text file I actually wanted.)
CREATE TABLE [dbo].[BuildCheck] (
[key] [uniqueidentifier] NOT NULL ,
[BuildStamp] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[version] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
) ON [PRIMARY]
GO


CREATE TABLE [dbo].[dtproperties] (
[id] [int] NOT NULL ,
[objectid] [int] NULL ,
[property] [varchar] (64) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[value] [varchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[uvalue] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[lvalue] [image] NULL ,
[version] [int] NOT NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO

CREATE TABLE [dbo].[tblCountyNames] (
[pkey] [uniqueidentifier] NOT NULL ,
[intCountyNum] [int] NULL ,
[txtCountyName] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[txtpassword] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
) ON [PRIMARY]
GO
What I wanted was this.

"BuildCheck","key"
"dtproperties", "id"
"tblCountyNames", "pkey"



So I turned again to the hidden power of Regular Expressions.  I crafted the following RegEx line
^(CREATE TABLE *)([\d\w\s\]\[\. ]*)( *\(\r\s*)(\[\w*\])([\w,\[\]\(\)\r\s]*Go)
and then did a search and replace, replacing the schema with the second and fourth (one based) items.  I know it looks hopelessly complex, but it's not.

Item One
^(CREATE TABLE *)
Matches a "CREATE TABLE" command at the beginning of the line, and a chain of spaces (represented by the space character and the asterisk, which tells the engine to repeat the previous match)

Item Two
([\d\w\s\]\[\. ]*)
Matches a block of numbers (/d) words (/w) and whitespace (/s), beginning and ending brackets and periods.  This translates to the database name

Item Three
(\s*\(\r\s*)
Match some whitespace, then a left parenthesis, then a carriage return following by a string of whitespace.

Item Four
(\[\w*\])
Item Four is our key field name.  We're looking for an opening and closing bracket with word characters inside.

Item Five
([\w,\[\]\(\)\r\s]*GO)
Item five is everything else up and including to the word GO.

Then, I did a search for
(^\[dbo\]\.\[)(\w*)(\]\s)(,)(\[)(\w*)(\])
Replacing it with
"\2","\6"
and viola!

No comments: