Learn Regular Expressions with this free course

"Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems." - Jamie Zawinski

For some people, using regular expressions can be a problem. But it doesn't have to be a problem for you. This article is a full course on regular expressions.

1. Introduction

Regular expressions, or just RegEx, are used in almost all programming languages to define a search pattern that can be used to search for things in a string.

I've developed a free, full video course on Scrimba.com to teach the basics of regular expressions.

This article contains the course in written form. But if you would prefer to watch the video version with interactive lessons, you can check it out on Scrimba. The sections in this article correspond to the sections in the Scrimba course.

This course follows along with the RegEx curriculum at freeCodeCamp.org. You can check that out for coding challenges and to earn a certificate.

These lessons focus on using RegEx in JavaScript, but the principles apply in many other programming languages you might choose to use. If you don't already know basic JavaScript, it could be helpful to cover it a bit first. I also have a basic JavaScript course that you can access on Scrimba and on the freeCodeCamp.org YouTube channel.

So let's get started! You'll be saving the day in no time.

2. Using the Test Method

To match parts of strings using RegEx, we need to create patterns that help us do that matching. We can indicate that something is a RegEx pattern by putting the pattern between slashes /, like so: /pattern-we-want-to-match/.

Let's look at an example:

// We want to check the following sentence
let sentence = "The dog chased the cat."
// and this is the pattern we want to match.
let regex = /the/

Notice how we use /the/ to indicate that we are looking for "the" in our sentence.

We can use the RegEx test() method to tell whether a pattern is present in a string or not.

// String we want to test
let myString = "Hello, World!";
// Pattern we want to find
let myRegex = /Hello/;
// result is now true
let result = myRegex.test(myString);
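As a quick sanity check of the behaviour described above, here is a minimal sketch showing both outcomes of test(). The sample string and variable names are made up for illustration.

```javascript
// Hypothetical sample string; the names below are for illustration only.
const greeting = "Hello, World!";

// test() returns true when the pattern occurs anywhere in the string...
const hasHello = /Hello/.test(greeting);
// ...and false when it does not.
const hasGoodbye = /Goodbye/.test(greeting);

console.log(hasHello);   // true
console.log(hasGoodbye); // false
```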

3. Match Literal Strings

Let's now find Waldo.

let waldoIsHiding = "Somewhere Waldo is hiding in this text.";
let waldoRegex = /Waldo/;
// test() returns true, so result is now also true
let result = waldoRegex.test(waldoIsHiding);

Notice that in this example waldoRegex is case sensitive, so if we were to write /waldo/ with a lowercase 'w', then result would be false.

4. Match a Literal String with Different Possibilities

RegEx also has an OR operator, which is the | character.

let petString = "James has a pet cat.";
// We can now check whether any of the words is in the sentence
let petRegex = /dog|cat|bird|fish/;
let result = petRegex.test(petString);
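To make the effect of the | operator concrete, the sketch below runs the same pattern against one sentence that contains a listed word and one that does not. The second sentence is a made-up example.

```javascript
const petRegex = /dog|cat|bird|fish/;

// "cat" is one of the alternatives, so this matches.
const withListedPet = petRegex.test("James has a pet cat.");
// Hypothetical sentence: none of the four alternatives appears in it.
const withOtherPet = petRegex.test("Alice has a pet iguana.");

console.log(withListedPet); // true
console.log(withOtherPet);  // false
```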

5. Ignore Case While Matching

So far, we have looked at patterns where the case of the letters mattered. How can we make our RegEx patterns case insensitive?

To ignore case, we can add the i flag at the end of a pattern, like so: /some-pattern/i.

let myString = "freeCodeCamp";
// We ignore case by using the 'i' flag
let fccRegex = /freecodecamp/i;
// result is true
let result = fccRegex.test(myString);

6. Extract Matches

When we want to extract the matched value, we can use the match() method.

let extractStr = "Extract the word 'coding' from this string.";
let codingRegex = /coding/;
let result = extractStr.match(codingRegex);
console.log(result);
// Terminal will show:
// > ["coding"]

7. Find More Than the First Match

Now that we know how to extract one value, we can also extract multiple values by using the g flag.

let testStr = "Repeat, Repeat, Repeat";
let ourRegex = /Repeat/g;
testStr.match(ourRegex); // returns ["Repeat", "Repeat", "Repeat"]

We can also combine the g flag with the i flag to extract multiple matches while ignoring case.

let twinkleStar = "Twinkle, twinkle, little star";
let starRegex = /twinkle/ig;
// writing /twinkle/gi would have the same result.
let result = twinkleStar.match(starRegex);
console.log(result);
// Terminal will show:
// > ["Twinkle", "twinkle"]
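The difference between match() with and without the g flag can be sketched as follows. One detail worth knowing is that match() returns null when nothing matches, so the `|| []` fallback shown here is a common defensive idiom, not something the course prescribes.

```javascript
const repeatText = "Repeat, Repeat, Repeat";

// Without g: only the first match, plus extra details such as its index.
const firstOnly = repeatText.match(/Repeat/);
// With g: every match, as a plain array of strings.
const allMatches = repeatText.match(/Repeat/g);
// No match at all: match() returns null rather than an empty array.
const noMatches = repeatText.match(/Again/g);
// Falling back to [] makes it safe to read .length.
const safeCount = (repeatText.match(/Again/g) || []).length;

console.log(firstOnly[0], firstOnly.index); // Repeat 0
console.log(allMatches.length);             // 3
console.log(noMatches);                     // null
console.log(safeCount);                     // 0
```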

8. Match Anything with Wildcard Period

In RegEx, . is a wildcard character that matches any single character (except line breaks, by default).

let humStr = "I'll hum a song";
let hugStr = "Bear hug";
// Looks for anything with 3 characters beginning with 'hu'
let huRegex = /hu./;
humStr.match(huRegex); // Returns ["hum"]
hugStr.match(huRegex); // Returns ["hug"]
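Two properties of the wildcard are easy to miss: it must consume exactly one character, and a literal period has to be escaped with a backslash. The sketch below illustrates both with made-up strings.

```javascript
const huRegex = /hu./;

// The dot needs one character after "hu", so "hu" alone does not match.
const matchesHut = huRegex.test("hut");
const matchesHu = huRegex.test("hu");

// An unescaped dot matches any character; an escaped dot matches only ".".
const looseVersion = /3.14/.test("3514");
const strictVersion = /3\.14/.test("3514");
const strictOnRealDot = /3\.14/.test("3.14");

console.log(matchesHut, matchesHu);                        // true false
console.log(looseVersion, strictVersion, strictOnRealDot); // true false true
```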

9. Match Single Character with Multiple Possibilities

Matching any character is nice, but what if we want to restrict the matching to a predefined set of characters? We can do so by using [] inside our RegEx.

If we have /b[aiu]g/, it means that we can match ‘bag’, ‘big’ and ‘bug’.

If we want to extract all the vowels from a sentence, this is how we can do it using RegEx.

let quoteSample = "Beware of bugs in the above code; I have only proved it correct, not tried it.";
let vowelRegex = /[aeiou]/ig;
let result = quoteSample.match(vowelRegex);
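Here is a small sketch of the /b[aiu]g/ pattern mentioned above, together with vowel extraction on a shorter, made-up string so the result is easy to verify by eye.

```javascript
const bagRegex = /b[aiu]g/;

// Only 'a', 'i' and 'u' are allowed in the middle position.
const setResults = ["bag", "big", "bug", "bog"].map((word) => bagRegex.test(word));

// Extract every vowel, ignoring case, from a short sample.
const vowels = "Beware of bugs".match(/[aeiou]/ig);

console.log(setResults); // [true, true, true, false]
console.log(vowels);     // ["e", "a", "e", "o", "u"]
```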

10. Match Letters of the Alphabet

But what if we want to match a range of letters? Sure, let’s do that.

let quoteSample = "The quick brown fox jumps over the lazy dog.";
// We can match all the letters from 'a' to 'z', ignoring casing.
let alphabetRegex = /[a-z]/ig;
let result = quoteSample.match(alphabetRegex);

11. Match Numbers and Letters of the Alphabet

Letters are good, but what if we also want numbers?

let quoteSample = "Blueberry 3.141592653s are delicious.";
// Match numbers between 2 and 6 (both inclusive),
// and letters between 'h' and 's'.
let myRegex = /[2-6h-s]/ig;
let result = quoteSample.match(myRegex);

12. Match Single Characters Not Specified

Sometimes it’s easier to specify the characters that you don’t want to match. These are called ‘Negated Characters’, and in RegEx you can create them by placing ^ at the start of a character set.

let quoteSample = "3 blind mice.";
// Match everything that is not a number or a vowel.
let myRegex = /[^0-9aeiou]/ig;
let result = quoteSample.match(myRegex);
// Returns [" ", "b", "l", "n", "d", " ", "m", "c", "."]

13. Match Characters that Occur One or More Times

If you want to match a character that occurs one or more times, you can use +.

let difficultSpelling = "Mississippi";
let myRegex = /s+/g;
let result = difficultSpelling.match(myRegex);// Returns ["ss", "ss"]

14. Match Characters that Occur Zero or More Times

There is also a * RegEx quantifier. This one matches zero or more occurrences of a character. Why might this be useful? Most of the time it is used in combination with other characters. Let’s look at an example.

let soccerWord = "gooooooooal!";
let gPhrase = "gut feeling";
let oPhrase = "over the moon";
// We are trying to match 'g', 'go', 'goo', 'gooo' and so on.
let goRegex = /go*/;
soccerWord.match(goRegex); // Returns ["goooooooo"]
gPhrase.match(goRegex); // Returns ["g"]
oPhrase.match(goRegex); // Returns null
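To contrast * with the + quantifier from the previous section, this sketch runs both on the same phrase. With +, at least one 'o' is required after the 'g'; with *, the 'g' alone is enough.

```javascript
const gutPhrase = "gut feeling";

// + requires at least one 'o' after 'g', and "gut feeling" has none.
const oneOrMore = gutPhrase.match(/go+/);
// * accepts zero 'o' characters, so the bare 'g' matches.
const zeroOrMore = gutPhrase.match(/go*/);

console.log(oneOrMore);     // null
console.log(zeroOrMore[0]); // g
```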

15. Find Characters with Lazy Matching

Sometimes a pattern can match in more than one way. For example, let’s say I’m looking for a pattern in the word ‘titanic’, and my matched values must begin with a ‘t’ and end with an ‘i’. My possible results are ‘titani’ and ‘ti’.

This is why RegEx has the concepts of ‘Greedy Match’ and ‘Lazy Match’.

A greedy match finds the longest possible match of the string that fits the RegEx pattern. This is the default RegEx behavior:

let string = "titanic";
let regex = /t[a-z]*i/;
string.match(regex);// Returns ["titani"]

A lazy match finds the shortest possible match of the string that fits the RegEx pattern. To use it, we add ? after the quantifier:

let string = "titanic";
let regex = /t[a-z]*?i/;
string.match(regex);// Returns ["ti"]
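A common place where the greedy versus lazy distinction matters is extracting quoted text. The sketch below uses a made-up sentence with two quoted words: the greedy pattern runs to the last quote, while the lazy one stops at the first closing quote.

```javascript
// Hypothetical sample sentence containing two quoted words.
const quoted = 'She said "yes" and then "no".';

// Greedy: .* grabs as much as possible, up to the LAST double quote.
const greedy = quoted.match(/".*"/)[0];
// Lazy: .*? grabs as little as possible, up to the NEXT double quote.
const lazy = quoted.match(/".*?"/)[0];

console.log(greedy); // "yes" and then "no"
console.log(lazy);   // "yes"
```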

16. Find One or More Criminals in a Hunt

Now let’s have a look at a RegEx challenge. We need to find all the criminals (‘C’) in a crowd. We know that they always stay together, and you need to write a RegEx that would find them.

let crowd = 'P1P2P3P4P5P6CCCP7P8P9';
let reCriminals = /./; // Change this line
let matchedCriminals = crowd.match(reCriminals);

You can find me walking through the solution in this Scrimba cast.
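If you would rather compare against something in writing, here is one possible solution. It is a sketch built on the + quantifier from section 13 and may differ from the one shown in the cast.

```javascript
const crowd = "P1P2P3P4P5P6CCCP7P8P9";

// One or more consecutive 'C' characters (a possible solution, not
// necessarily the one from the video).
const reCriminals = /C+/;
const matchedCriminals = crowd.match(reCriminals);

console.log(matchedCriminals[0]); // CCC
```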

17. Match Beginning String Patterns

RegEx also allows you to match patterns that are only at the beginning of a string. We’ve already talked about ^ creating a negating set. We can use the same symbol to find a match only at the beginning of a string.

let calAndRicky = "Cal and Ricky both like racing.";
// Match 'Cal' only if it's at the beginning of a string.
let calRegex = /^Cal/;
let result = calRegex.test(calAndRicky); // Returns true
let rickyAndCal = "Ricky and Cal both like racing.";
let result2 = calRegex.test(rickyAndCal); // Returns false

18. Match Ending String Patterns

What about matching a pattern at the end of a string? We can use $ for that.

let caboose = "The last car on a train is the caboose";
// Match 'caboose' if it's at the end of a string.
let lastRegex = /caboose$/;
let result = lastRegex.test(caboose); // Returns true
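Because ^ means different things inside and outside a character set, it can help to see both uses side by side, along with ^ and $ combined to match a whole string. The patterns and sample strings below are illustrative only.

```javascript
// ^ outside brackets: the match must start at the beginning of the string.
const startsWithDigit = /^[0-9]/;
// ^ inside brackets: match any character that is NOT a digit.
const hasNonDigit = /[^0-9]/;
// ^ and $ together: the whole string must be exactly "caboose".
const exactlyCaboose = /^caboose$/;

console.log(startsWithDigit.test("3 blind mice")); // true
console.log(startsWithDigit.test("blind mice 3")); // false
console.log(hasNonDigit.test("12345"));            // false
console.log(exactlyCaboose.test("caboose"));       // true
console.log(exactlyCaboose.test("the caboose"));   // false
```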

19. Match All Letters and Numbers

Earlier in parts 10 and 11 I showed you how we can match ranges of letters and numbers. If I asked you to write a RegEx that matches all the letters and numbers and ignore their cases you probably would have written something like /[a-z0-9]/gi and that’s exactly right. But it’s a bit too long.

RegEx has something called ‘Shorthand Character Classes’, which are basically shorthands for common RegEx patterns. For matching all letters and numbers we can use \w, and we also get the underscore _ matched as a bonus.

let quoteSample = "The five boxing wizards jump quickly.";
// Same as /[a-z0-9_]/gi to match a-z (ignore case), 0-9 and _
let alphabetRegexV2 = /\w/g;
// The number of characters in the string,
// excluding spaces and the period.
let result = quoteSample.match(alphabetRegexV2).length;
// Returns 31

20. Match Everything But Letters and Numbers

If we want to do the opposite and match everything that is not a letter or a number (also excluding the underscore _), we can use \W.

let quoteSample = "The five boxing wizards jump quickly.";
// Match spaces and the period
let nonAlphabetRegex = /\W/g;
let result = quoteSample.match(nonAlphabetRegex).length;
// Returns 6

21. Match All Numbers

Ok, what about if you want only numbers? Is there a shorthand character class for that? Sure, it’s \d.

let numString = "Your sandwich will be $5.00";
// Match all the numbers
let numRegex = /\d/g;
let result = numString.match(numRegex).length; // Returns 3

22. Match All Non-Numbers

Would you like the opposite, to match all the non-numbers? Use \D.

let numString = "Your sandwich will be $5.00";
// Match everything that is not a number
let noNumRegex = /\D/g;
let result = numString.match(noNumRegex).length; // Returns 24
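The four shorthand classes covered so far come in complementary pairs: every character is matched by exactly one of \w and \W, and by exactly one of \d and \D. A short sketch on a made-up string shows this.

```javascript
// Hypothetical sample chosen to include a letter, underscore, digit,
// space and hyphen.
const mixed = "a_1 b-2";

const wordChars = mixed.match(/\w/g);    // letters, digits and underscore
const nonWordChars = mixed.match(/\W/g); // everything else
const digits = mixed.match(/\d/g);       // digits only
const nonDigits = mixed.match(/\D/g);    // everything except digits

console.log(wordChars);    // ["a", "_", "1", "b", "2"]
console.log(nonWordChars); // [" ", "-"]
console.log(digits);       // ["1", "2"]
console.log(nonDigits);    // ["a", "_", " ", "b", "-"]
```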

23. Restrict Possible Usernames

So far so good! Well done for making it this far. RegEx can be tricky as it’s not the most easily readable way to code. Let’s now look at a very real-life example and make a username validator. In this case you have 3 requirements:

  • If there are numbers, they must be at the end.
  • Letters can be lowercase and uppercase.
  • At least two characters long. Two-letter names can’t have numbers.

Try to solve this on your own and if you find it difficult or just want to check the answer, check out my solution.
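For reference, here is one way the three requirements could be expressed. This is a sketch rather than the course's official solution, and it makes two assumptions not stated above: a username must start with a letter, and a single letter followed by two or more digits (such as "Z97") is acceptable. It also uses a non-capturing group (?:...), which simply groups alternatives without capturing them.

```javascript
// One possible validator (an assumption-laden sketch, not the official answer).
// ^[a-z]        : must start with a letter (assumption)
// [a-z]+\d*     : either more letters, then optional trailing digits
// \d{2,}        : or at least two digits right after the first letter
const userCheck = /^[a-z](?:[a-z]+\d*|\d{2,})$/i;

const samples = ["JackOfAllTrades", "Jo", "Oceans11", "Z97", "J", "J1", "007", "c57bT3"];
const verdicts = samples.map((name) => userCheck.test(name));

console.log(verdicts);
// [true, true, true, true, false, false, false, false]
```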

24. Match Whitespace

Can we match all the whitespace characters? Of course, there is a shorthand for that too: \s.

let sample = "Whitespace is important in separating words";
// Match all the whitespace characters
let countWhiteSpace = /\s/g;
let result = sample.match(countWhiteSpace);
// Returns [" ", " ", " ", " ", " "]

25. Match Non-Whitespace Characters

Can you guess how to match all non-whitespace characters? Well done, it’s \S!

let sample = "Whitespace is important in separating words";
// Match all non-whitespace characters
let countNonWhiteSpace = /\S/g;
let result = sample.match(countNonWhiteSpace);

26. Specify Upper and Lower Number of Matches

You can specify the lower and upper number of pattern matches with ‘Quantity Specifiers’. They can be used with {} syntax, for example {3,6}, where 3 is the lower bound and 6 is the upper bound to be matched.

let ohStr = "Ohhh no";
// We want to match 'Oh's that have 3-6 'h' characters in them.
let ohRegex = /Oh{3,6} no/;
let result = ohRegex.test(ohStr); // Returns true
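To see both bounds in action, the sketch below builds strings with different numbers of 'h' characters and tests each one. The helper function ohWith is hypothetical and exists only for this illustration.

```javascript
const ohRegex = /Oh{3,6} no/;

// Hypothetical helper: builds "O" + n times "h" + " no".
const ohWith = (n) => "O" + "h".repeat(n) + " no";

// 2 is below the lower bound, 3 and 6 are inside it, 7 is above the upper bound.
const boundResults = [2, 3, 6, 7].map((n) => ohRegex.test(ohWith(n)));

console.log(boundResults); // [false, true, true, false]
```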

27. Specify Only the Lower Number of Matches

When we want to specify only the lower bound, we can do it by omitting the upper bound, for example to match at least three characters we can write {3,}. Notice that we still need a comma, even when we don’t specify the upper limit.

let haStr = "Hazzzzah";
// Match a pattern that contains at least four 'z' characters
let haRegex = /z{4,}/;
let result = haRegex.test(haStr); // Returns true

28. Specify Exact Number of Matches

In the previous section I mentioned that we need a comma in {3,} when we specify only the lower bound. The reason is when you write {3} without a comma, it means that you are looking to match exactly 3 characters.

let timStr = "Timmmmber";
// Match 'Ti', then exactly four 'm' characters, then 'ber'
let timRegex = /Tim{4}ber/;
let result = timRegex.test(timStr); // Returns true
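The difference the comma makes can be checked directly. In this sketch, timberWith is a hypothetical helper that builds the word with a chosen number of 'm' characters.

```javascript
// Hypothetical helper: "Ti" + n times "m" + "ber".
const timberWith = (n) => "Ti" + "m".repeat(n) + "ber";

const exactlyFour = /Tim{4}ber/;
const atLeastFour = /Tim{4,}ber/;

// Five 'm' characters: too many for {4}, fine for {4,}.
const exactOnFive = exactlyFour.test(timberWith(5));
const atLeastOnFive = atLeastFour.test(timberWith(5));
// Three 'm' characters: too few for both.
const exactOnThree = exactlyFour.test(timberWith(3));
const atLeastOnThree = atLeastFour.test(timberWith(3));

console.log(exactOnFive, atLeastOnFive);   // false true
console.log(exactOnThree, atLeastOnThree); // false false
```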

29. Check for All or None

There are times when a character in your pattern may or may not be present. When a letter or a number is optional, we can use ? for that.

// We want to match both British and American English spellings
// of the word 'favourite'
let favWord_US = "favorite";
let favWord_GB = "favourite";
// We match both 'favorite' and 'favourite'
// by specifying that the 'u' character is optional
let favRegex = /favou?rite/;
let result1 = favRegex.test(favWord_US); // Returns true
let result2 = favRegex.test(favWord_GB); // Returns true

30. Positive and Negative Lookahead

Lookaheads are patterns that tell JavaScript to look ahead in the string to check for patterns further along, without including them in the match. They are useful when you’re trying to search for multiple patterns in the same string. There are 2 types of lookaheads: positive and negative.

Positive lookahead uses the ?= syntax:

let quit = "qu";
// We match 'q' only if it has 'u' after it.
let quRegex = /q(?=u)/;
quit.match(quRegex); // Returns ["q"]

Negative lookahead uses the ?! syntax:

let noquit = "qt";
// We match 'q' only if there is no 'u' after it.
let qRegex = /q(?!u)/;
noquit.match(qRegex); // Returns ["q"]
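A typical use of several lookaheads in one pattern is a simple password check. The sketch below is illustrative only, and the rule it encodes (at least six word characters from the start, and at least one digit somewhere) is an assumption made for the example.

```javascript
// Two positive lookaheads, both evaluated from the start of the string:
// (?=\w{6,}) : at least six word characters come first
// (?=\D*\d)  : there is a digit somewhere after zero or more non-digits
const pwCheck = /^(?=\w{6,})(?=\D*\d)/;

// Hypothetical candidate passwords.
const pwResults = ["abc123", "astronaut7", "abcdef", "ab1"].map((pw) => pwCheck.test(pw));

console.log(pwResults); // [true, true, false, false]
```

Since lookaheads do not consume characters, both conditions are checked against the same position in the string.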

31. Reuse Patterns Using Capture Groups

Let’s imagine we need to capture a repeating pattern.

let repeatStr = "regex regex";
// We want to match letters followed by a space and then letters
let repeatRegex = /(\w+)\s(\w+)/;
repeatRegex.test(repeatStr); // Returns true

Instead of repeating (\w+) at the end, we can tell RegEx to reuse the first capture group by writing \1. Note that \1 matches the exact text the group captured, so the pattern now requires the second word to be the same as the first:

let repeatStr = "regex regex";
let repeatRegex = /(\w+)\s\1/;
repeatRegex.test(repeatStr); // Returns true
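The practical difference between a second (\w+) and the backreference \1 shows up when the two words differ. This sketch compares both patterns on a made-up string.

```javascript
const twoWords = /(\w+)\s(\w+)/; // any word, a space, any word
const sameWord = /(\w+)\s\1/;    // a word, a space, the SAME word again

console.log(twoWords.test("hello world")); // true
console.log(sameWord.test("hello world")); // false
console.log(sameWord.test("regex regex")); // true

// match() also exposes what the group captured, at index 1.
const captured = "regex regex".match(sameWord);
console.log(captured[0]); // regex regex
console.log(captured[1]); // regex
```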

32. Use Capture Groups to Search and Replace

When we find a match, it’s sometimes handy to replace it with something else. We can use the replace() method for that.

let wrongText = "The sky is silver.";
let silverRegex = /silver/;
wrongText.replace(silverRegex, "blue");
// Returns "The sky is blue."
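Capture groups can also be referenced in the replacement string, using $1, $2 and so on. Here is a minimal sketch with made-up strings that swaps two words, and also shows that replace() only changes the first match unless the g flag is set.

```javascript
// $1 and $2 refer to the first and second capture groups.
const swapped = "Code Camp".replace(/(\w+)\s(\w+)/, "$2 $1");

// Without g only the first occurrence is replaced; with g, all of them.
const firstReplaced = "silver and silver".replace(/silver/, "blue");
const allReplaced = "silver and silver".replace(/silver/g, "blue");

console.log(swapped);       // Camp Code
console.log(firstReplaced); // blue and silver
console.log(allReplaced);   // blue and blue
```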

33. Remove Whitespace from Start and End

Here’s a little challenge for you. Write a RegEx that would remove any whitespace around the string.

let hello = " Hello, World! ";
let wsRegex = /change/; // Change this line
let result = hello; // Change this line

If you get stuck or just want to check my solution, feel free to have a look at the Scrimba cast where I solve this challenge.
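For comparison, here is one possible approach, which may differ from the solution in the cast. It combines the ^ and $ anchors, \s, +, the | operator and the g flag, all covered earlier.

```javascript
const hello = " Hello, World! ";

// Whitespace at the start OR whitespace at the end, replaced everywhere (g).
const wsRegex = /^\s+|\s+$/g;
const trimmed = hello.replace(wsRegex, "");

console.log(trimmed); // Hello, World!
```

The built-in trim() method does the same job; the point of the challenge is to practice writing the pattern by hand.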

34. Conclusion

Congratulations! You have finished this course! If you’d like to keep learning, feel free to check out this YouTube playlist, which has a lot of JavaScript projects you can create.

Keep learning and thanks for reading!

You are now ready to play regex golf.