Monday, 27 April 2015

Regular expressions with word boundaries fail in .NET Regex


I have a problem with regular expressions in .NET. We are trying to match special tokens in a text that are delimited by the section sign (§). For completeness, the corresponding regular expressions are wrapped in word boundaries (\b). The problem is that the regular expression wrapped in \b does not match the token at all:

    using System;
    using System.Text.RegularExpressions;

    class Program
    {
        static void Main(string[] args)
        {
            string data = "I would like to replace this §pattern§ with something interesting";
            string requiredResult = "I would like to replace this serious text with something interesting";

            // Same token, without and with \b anchors around it.
            Regex regSuccess = new Regex("§pattern§");
            Regex regFail = new Regex(@"\b§pattern§\b");

            var dataSuccess = regSuccess.Replace(data, "serious text");
            var dataFail = regFail.Replace(data, "serious text");

            // Compare each result with the expected string.
            Console.WriteLine("regSuccess match: {0}", dataSuccess == requiredResult);
            Console.WriteLine("regFail match: {0}", dataFail == requiredResult);
            Console.WriteLine("Press enter to continue");
            Console.ReadLine();
        }
    }

As you can see, dataFail == requiredResult evaluates to false: the pattern with \b finds nothing, so the text is left unchanged.
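A likely explanation, offered as a sketch rather than a definitive answer: \b only matches at a position between a word character (\w) and a non-word character, and § is not a word character. In the sample text the first § is preceded by a space and the last § is followed by a space, so there is no word boundary at either position and the pattern cannot match. One possible workaround, assuming the intent is "the token must not be directly adjacent to a word character", is to replace \b with the lookarounds (?<!\w) and (?!\w). The class BoundaryDemo and the helper ReplaceToken are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class BoundaryDemo
{
    // Hypothetical helper: replaces `token` wherever it is not directly
    // adjacent to a word character. Assumption: this is the behavior the
    // \b anchors were meant to provide.
    public static string ReplaceToken(string input, string token, string replacement)
    {
        // (?<!\w) = no word character immediately before the token
        // (?!\w)  = no word character immediately after the token
        string pattern = @"(?<!\w)" + Regex.Escape(token) + @"(?!\w)";
        return Regex.Replace(input, pattern, replacement);
    }

    static void Main()
    {
        string data = "I would like to replace this §pattern§ with something interesting";
        string requiredResult = "I would like to replace this serious text with something interesting";

        // \b needs a word character on exactly one side; a space next to § gives none.
        Console.WriteLine(Regex.IsMatch(data, @"\b§pattern§\b")); // False

        string result = ReplaceToken(data, "§pattern§", "serious text");
        Console.WriteLine(result == requiredResult); // True
    }
}
```

Note that this is not equivalent to \b: with the lookarounds, a token glued to a word (for example word§pattern§) is skipped, whereas the original \b pattern would match only in that case. Which behavior is wanted depends on how the tokens appear in the real text.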

