More handy #Scrivener #indiepub #regex

I’m trying to come up with a set of easy mistake catchers that will do a lot of the more mindless editorial work for me. Having stared at the same 100,000 words for about six months, I’m losing the ability to see the errors. Here is another short set of Scrivener Regular Expression (regex) patterns that will find easy-to-fix problems. (Look at yesterday’s post on #regex to get more information on the definitions I’m using here. As a note, \w means a word character, ^ means ‘start of the line’ and [ ] matches any one of the characters, or ranges of characters, listed inside it. I’ll add some more as I go.)

  • \w”
    • This looks for a single “word” character followed immediately by a closing quotation mark. Scrivener usually replaces straight quotes with opening and closing quotation marks. It’s important to use the right one in the pattern, but you can look for both by modifying the pattern to \w[“”].
    • This should catch every place where you’ve written speech and missed the punctuation at the end.
  • ^[“"][a-z]
    • You have to switch on case-sensitivity in the search to get this one to work, but it will start at the beginning of each line (^) and look for an opening or straight quote that is followed by a lower-case letter.
    • This will catch sentences such as “the fish sang.” and “dang it all to heck, man!” but it will also pick up any deliberate use of the lower case, such as “iBooks is a distribution platform.”
  • \w,[”"]$
    • Now we have a word character (A-Z, a-z, 0-9, and _) followed by a comma. After this, we’re looking for a closing quote, either curly or straight. But what about the $? That means “this has to be at the end of a line”. It’s the companion to ^.
    • This will show us any time that we used a comma as the final punctuation inside quotation marks and then started a new line.
    • This will find

      “This is a terrible thing,”

      but it won’t find

      “This is a terrible thing,” said Charlie, “What are we to do?”

      because the ,” isn’t at the end of a line.

  • \s\d\s
    • This will find any digits that are sitting around by themselves. If you, like me, would prefer to write most of your numbers as words but keep forgetting to do it, this pattern is for you!
    • \d means any digit from 0-9. The spaces are around it to stop the pattern matching digits that are part of bigger numbers. (\d by itself would match the 1, the 9, the 3 and the 2 in 1932 but as individual matches.)
    • This will find patterns such as “He said that there were 3 beasts”.
    • Want to find longer numbers? \s\d{2}\s will find all numbers that are two digits long and are surrounded by spaces.
    • Want to find all numbers between 1 and 4 digits long? \s\d{1,4}\s is the pattern and will find numbers that are 1, 2, 3, or 4 digits long.
  • (\w)(\w)(\w)\3\2\1
    • This one is just for fun. It finds palindrome patterns that are six characters long. In my text, “sniffing” and “suffused” are matches.
    • Adding a space character in the middle gives us (\w)(\w)(\w) \3\2\1 and this will match “sword dropped” and “now wonder”.
    • Really rather useless unless you have a habit of writing palindromic text and wish to stop.
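
If you want to play with the palindrome patterns outside Scrivener, here is a minimal sketch using Python’s re module. Python is my assumption here, not something Scrivener uses, and its regex engine may differ from Scrivener’s in small details. The function names are made up for illustration. Note that the match is only the six (or seven) palindromic characters, not the whole word.

```python
import re

# Three captured word characters, then the same three in reverse order.
PALINDROME = re.compile(r"(\w)(\w)(\w)\3\2\1")
# The same idea with a single space in the middle.
SPACED_PALINDROME = re.compile(r"(\w)(\w)(\w) \3\2\1")


def palindromic_runs(text):
    """Return every six-character palindromic run found in text."""
    return [m.group(0) for m in PALINDROME.finditer(text)]


def spaced_palindromic_runs(text):
    """Return every 3 + space + 3 palindromic run found in text."""
    return [m.group(0) for m in SPACED_PALINDROME.finditer(text)]


print(palindromic_runs("She was sniffing, suffused with joy."))
# ['niffin', 'suffus']
print(spaced_palindromic_runs("The sword dropped, and now wonder set in."))
# ['ord dro', 'now won']
```

The backreferences \3, \2 and \1 are what make this work: each one must repeat exactly what its numbered group captured, so the second half of the match mirrors the first.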

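The proofreading patterns in the list above can also be run over exported text in one pass. This is a rough sketch in Python, which is my assumption rather than anything Scrivener does: the labels, the CHECKS table and the run_checks function are all hypothetical. Python’s re module is case-sensitive by default, and it needs re.MULTILINE before ^ and $ will match at every line rather than only at the start and end of the whole text.

```python
import re

# Hypothetical labels mapped to the patterns from the list above.
CHECKS = {
    # Word character directly before a closing curly quote.
    "missing end punctuation": re.compile(r"\w”"),
    # Line starting with an opening or straight quote, then a lower-case letter.
    "lower-case opening": re.compile(r'^[“"][a-z]', re.MULTILINE),
    # Comma inside closing quotes at the very end of a line.
    "comma at line end": re.compile(r"\w,”$", re.MULTILINE),
    # A single digit with whitespace on both sides.
    "lone digit": re.compile(r"\s\d\s"),
}


def run_checks(text):
    """Return (label, matched text) pairs for every hit, in CHECKS order."""
    hits = []
    for label, pattern in CHECKS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group(0)))
    return hits


sample = "\n".join([
    "“the fish sang.”",
    "“This is a terrible thing,”",
    "“This is a terrible thing,” said Charlie, “What are we to do?”",
    "He said that there were 3 beasts.",
    "“I see”",
])

for label, found in run_checks(sample):
    print(f"{label}: {found!r}")
```

Each hit is only the few characters the pattern matched, so in practice you would probably want to print the surrounding line as well before deciding whether it is a real mistake.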
 
