
teek

macrumors member
Original poster
Feb 12, 2008
Norway
I'm trying to parse a few <input ../> tags, but this regex only works for input tags that are on separate lines:
NSString *html = @"html with multiple lines containing a few lines with input elements only and then a few lines with multiple input elements";

NSString *regex = @"(?i:<input.*name=\"(.*))\".*value=\"(.*)\".*/>";
NSArray *names = [html componentsMatchedByRegex:regex capture:1];
NSArray *values = [html componentsMatchedByRegex:regex capture:2];

This works fine for lines with a single element, but it does NOT work on lines that contain multiple <input .../> elements.

What's wrong with my regex? Also, this regex won't work if the name and value attributes are not in the order shown. I want it to work regardless of order and even when other attributes are specified:


<input id="foo" value="bar" name="something"/>
<input value="foo" id="y" name="bar" other_attribute="x"/>

Can someone please help with this regex? I've looked at the docs but I can't figure it out.
 
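Think about what the last .* is doing. It matches any character and is greedy, so it also swallows the whole next <input.../> tag on the same line; the same goes for the other .* in the pattern. (By default . doesn't match line breaks, which is why your regex works when each tag is on its own line.) Either use something more specific than .* (\s for whitespace looks like what you want), or add a ? after each .* to force it to match as little as possible.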
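To make the greedy vs. lazy difference concrete, here is a rough sketch. It uses Foundation's NSRegularExpression (which also follows ICU pattern syntax) rather than RegexKitLite, and Captures, kGreedyPattern and kLazyPattern are names I made up for the example, so treat it as an illustration under those assumptions rather than a drop-in fix:

```objc
#import <Foundation/Foundation.h>

// The original pattern shape: every .* is greedy.
static NSString *const kGreedyPattern = @"<input.*name=\"(.*)\".*value=\"(.*)\".*/>";
// Same pattern with a ? after each .* so each one matches as little as possible.
static NSString *const kLazyPattern = @"<input.*?name=\"(.*?)\".*?value=\"(.*?)\".*?/>";

// Hypothetical helper (a stand-in for RegexKitLite's componentsMatchedByRegex:capture:).
// Returns capture group `group` of every match of `pattern` in `text`.
static NSArray *Captures(NSString *text, NSString *pattern, NSUInteger group) {
    NSRegularExpression *re =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:NULL];
    NSMutableArray *out = [NSMutableArray array];
    NSRange whole = NSMakeRange(0, [text length]);
    for (NSTextCheckingResult *m in [re matchesInString:text options:0 range:whole]) {
        NSRange r = [m rangeAtIndex:group];
        if (r.location != NSNotFound) {
            [out addObject:[text substringWithRange:r]];
        }
    }
    return out;
}
```

For a single line such as <input name="a" value="1"/><input name="b" value="2"/>, Captures(line, kGreedyPattern, 1) should return only b, because the first .* runs to the end of the line and backtracks to the last name=. With kLazyPattern the same call should return a and b. Note that a lazy .*? can still run into the next tag if one tag is missing an attribute, so a negated class like [^>]* is the stricter option.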

As for matching arbitrary numbers of attributes, you could wrap that section in a non-capture group and + or * it. Something like this (note I haven't tested it):

...input\\s*(?:\\w+=\"(.*?)\"\\s*)+...
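One caveat with a repeated group: a capture group inside (?:...)+ only keeps the text from its last repetition, so a single pattern won't hand you every attribute. A different approach that sidesteps this is to do it in two passes: first match each <input ...> tag, then pull name="value" pairs out of that tag's text. The sketch below assumes Foundation's NSRegularExpression instead of RegexKitLite, double-quoted attribute values, and no > inside a value; InputAttributes is a hypothetical helper name:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns one NSDictionary (attribute name -> value)
// per <input> tag, regardless of attribute order or extra attributes.
static NSArray *InputAttributes(NSString *html) {
    // Pass 1: capture everything between "<input" and the closing ">".
    NSRegularExpression *tagRe =
        [NSRegularExpression regularExpressionWithPattern:@"<input\\b([^>]*)>"
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:NULL];
    // Pass 2: individual attr="value" pairs (double-quoted values only).
    NSRegularExpression *attrRe =
        [NSRegularExpression regularExpressionWithPattern:@"(\\w+)\\s*=\\s*\"([^\"]*)\""
                                                  options:0
                                                    error:NULL];
    NSMutableArray *result = [NSMutableArray array];
    NSRange whole = NSMakeRange(0, [html length]);
    for (NSTextCheckingResult *tag in [tagRe matchesInString:html options:0 range:whole]) {
        NSMutableDictionary *attrs = [NSMutableDictionary dictionary];
        // Restrict the attribute search to this tag's body; the ranges that
        // come back are still expressed relative to the full string.
        NSRange body = [tag rangeAtIndex:1];
        for (NSTextCheckingResult *a in [attrRe matchesInString:html options:0 range:body]) {
            NSString *key = [[html substringWithRange:[a rangeAtIndex:1]] lowercaseString];
            NSString *value = [html substringWithRange:[a rangeAtIndex:2]];
            [attrs setObject:value forKey:key];
        }
        [result addObject:attrs];
    }
    return result;
}
```

With the two example tags from the question, looking up name and value in each returned dictionary should give something/bar for the first tag and bar/foo for the second, whatever order the attributes appear in. For anything beyond simple, well-formed tags a real HTML parser is probably the safer route.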

Also, take a look at the ICU user guide (what RegexKitLite uses as its backend):

http://userguide.icu-project.org/strings/regexp
 