Question

RegEx: Two of (Newline + Whitespace) + alphanumeric. Can't get it!

I thought one of these would work, but none of them does. Starting with my favorite:

const gRecordDelim = /(\n[ \t]*){2,}(?=\w)/g;
// 2 or more ( newline + 0 or more (spaces or tabs) ) 
// 	+ any alphanumeric

//const gRecordDelim = /(\n[ \t]*\n)/g;
// const gRecordDelim = /(\n[ \t]*){2,}(?=[A-Za-z0-9])/g;
// const gRecordDelim = /(\n[ \t]*){2,}\n/g;
// const gRecordDelim = /\n{2,}/g;
// const gRecordDelim = /([ \t]*\n[ \t]*){2,}/g;

It’s a split pattern in JavaScript:

const recs = text.split(gRecordDelim);
recs.forEach((rec) => console.log("RECORD: " + rec))

Here’s my data.

const text = `
		Orblie Rapitulnik
		orbliek.jpg
		orbliek.com
		There are many variations of passages of Lorem Ipsum available, but the majority have suffered alteration in some form, by injected humour, or randomised words which don't look even slightly believable.
		
		Qang Le Toenthal
		Qang.jpg
		Qangle.io
		Contrary to popular belief, Lorem Ipsum is not simply random text.`

I’m getting the split at “Qang” as expected, but it is also splitting at the empty line, which I don’t want.

https://gcdnb.pbrd.co/images/r8W99gj1osuR.png



You didn’t say what you want the result to be, only what you didn’t want. :-)

I highly recommend the website https://regex101.com/ and its quiz at https://regex101.com/quiz

Answer that works (the same as your first pattern, but with a non-capturing group):

const gRecordDelim = /(?:\n[ \t]*){2,}(?=\w)/g;
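
The reason the group style matters, as far as I can tell, is that `split()` includes the text of any capturing group in its result array, and that extra piece is what prints as an "empty" record. Here is a minimal sketch comparing the two, using a shortened, made-up sample string rather than the full data from the question (the variable names are only for illustration):

```javascript
// Shortened stand-in for the question's data (hypothetical sample).
const sample = "\n\t\tOrblie\n\t\torbliek.jpg\n\t\t\n\t\tQang\n\t\tQang.jpg";

// Capturing group: split() also returns the last text the group captured.
const capturing = sample.split(/(\n[ \t]*){2,}(?=\w)/);

// Non-capturing group: split() returns only the pieces between matches.
const nonCapturing = sample.split(/(?:\n[ \t]*){2,}(?=\w)/);

console.log(capturing.length);    // 3: record, captured "\n\t\t", record
console.log(nonCapturing.length); // 2: just the two records
```
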

Once the records are found, the fields can be split with:

const gFldDelim = /(?:\n[ \t]*)(?=\w)/g;
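
Putting the two patterns together might look like the sketch below. The sample text is a shortened, made-up version of the question's data, and the `filter()` step is my own addition (an assumption, not part of the patterns above): the first record starts with a newline plus tabs, so splitting it produces an empty leading piece that you probably want to drop.

```javascript
// Shortened, hypothetical version of the question's text.
const text = "\n\t\tOrblie Rapitulnik\n\t\torbliek.jpg\n\t\t\n\t\tQang Le Toenthal\n\t\tQang.jpg";

const gRecordDelim = /(?:\n[ \t]*){2,}(?=\w)/g;
const gFldDelim = /(?:\n[ \t]*)(?=\w)/g;

// Split into records, then split each record into fields.
// filter() drops the empty string produced when a record starts with a delimiter.
const records = text
  .split(gRecordDelim)
  .map((rec) => rec.split(gFldDelim).filter((fld) => fld !== ""));

records.forEach((fields) => console.log("RECORD:", fields));
```

Note that both patterns rely on `(?=\w)`, so a field that begins with a non-word character (a quote or a bracket, say) would not start a new field or record.
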
Bobby Iliev
Site Moderator
December 4, 2022

Hi there,

I’ve just tested this out and the following seems to match what you are after:

const text = `
Orblie Rapitulnik
orbliek.jpg
orbliek.com
There are many variations of passages of Lorem Ipsum available, but the majority have suffered alteration in some form, by injected humour, or randomised words which don't look even slightly believable.

Qang Le Toenthal
Qang.jpg
Qangle.io
Contrary to popular belief, Lorem Ipsum is not simply random text.`

const gRecordDelim = /\n\s*\n/g;

const recs = text.split(gRecordDelim);
recs.forEach((rec) => console.log("RECORD: " + rec))

Output:

RECORD: Orblie Rapitulnik ...

RECORD: Qang Le Toenthal ...

Rundown of the regex:

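  • \n: Matches a line feed character
  • \s: Matches any whitespace character (spaces, tabs, and newlines)
  • *: Repeats the preceding token (\s) zero or more times, so \s* matches any run of whitespace between the two line feeds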
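
One thing to be aware of, as far as I can tell: with the tab-indented data from the original question, this pattern splits on the blank line but leaves the next record's indentation in place, so a `trim()` may be wanted. A small sketch with a shortened, made-up sample (the variable names are only for illustration):

```javascript
// Shortened, hypothetical sample that keeps the question's tab indentation.
const indented = "\n\t\tOrblie\n\t\torbliek.jpg\n\t\t\n\t\tQang\n\t\tQang.jpg";

// \s* backtracks so the match ends at the last newline of the blank run.
const rawRecs = indented.split(/\n\s*\n/);

// trim() removes the leading newline/tabs left on each record.
const trimmedRecs = rawRecs.map((rec) => rec.trim());

trimmedRecs.forEach((rec) => console.log("RECORD: " + rec));
```
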

Hope that this helps!

Best,

Bobby