How to extract all number occurrences in a string with REGEX

Moderator: Martin

rcfree
Posts: 43
Joined: 21 Mar 2013 16:51

How to extract all number occurrences in a string with REGEX

Post by rcfree » 10 Dec 2016 17:43

Hi,

How can I extract all number occurrences in a string using REGEX?

I can select occurrences of numbers in a text using (\d+[.,]\d*?), but I cannot build a list containing all of the numbers.

=>STRING: ABCDEF 1245.00 - xyztusd 98.72 mgjduc 2450.78 euwiso 87.25 ansjcy 56.12.

=>PROBLEM: How to create a list containing only the numbers from the text, as below, using REGEX:

List [0] = 1245.00
List [1] = 98.72
List [2] = 2450.78
List [3] = 87.25
List [4] = 56.12

Thanks for the help!

Martin
Posts: 4468
Joined: 09 Nov 2012 14:23

Re: How to extract all number occurrences in a string with R

Post by Martin » 10 Dec 2016 20:40

Hi,

To my knowledge, it's not possible to extract all numbers with a single command.
The following script might work:

Code:

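s="ABCDEF 1245.00 - xyztusd 98.72 mgjduc 2450.78 euwiso 87.25 ansjcy 56.12";
// split the string into whitespace-separated words
words=split(s, "\\s");
numbers=newList();
for(w in words)
{
  w=trim(w);
  // matches has to match the whole word, so only pure numbers are added
  if(matches(w, "\\d+[.,]\\d*?")) addElement(numbers, w);
}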
Regards,
Martin

bogdyro
Posts: 241
Joined: 04 Apr 2015 15:14

Re: How to extract all number occurrences in a string with R

Post by bogdyro » 11 Dec 2016 08:27

Hi. There's something fishy going on with the regex engine, but I can't put my finger on it.
Anyway, the regular expression (\d{1,}.\d{1,}) works just fine in .NET. It matches all numbers, and each number is in its own group.
See here: http://regexstorm.net/tester?p=%28%5cd% ... +STOP.&o=e

In Automagic it's a no-go.
Android uses the Java regex engine, and after searching a lot on the web I found an online tester that also fails to match the string. Assuming it fails for the same reason as in Automagic, there might be a solution.
http://www.regexplanet.com/advanced/java/index.html
Testing (\d{1,}.\d{1,}) on the string ABCDEF 1245.00 - xyztusd 98.72 mgjduc 2450.78 euwiso 87.25 ansjcy 56.12. gives false for matches(), as in Automagic. However, the find() function gets all the numbers correctly. I think it's already implemented in the Automagic regex tester, as the numbers are highlighted there too. So implementing this function in the script would be great.
Anyway, I don't know why the Java implementation needs to match ALL characters in a string for matches() to be true; I will investigate. I'm actually quite displeased that there is such a big difference between the two implementations.
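
For anyone who wants to reproduce the matches() versus find() difference in plain Java, here is a minimal sketch. The class name FindVsMatches, its helper methods, and the pattern \d+[.,]\d+ are illustrative assumptions, not anything taken from Automagic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindVsMatches {
    // Assumed pattern: digits, a '.' or ',' separator, then more digits.
    static final Pattern NUMBER = Pattern.compile("\\d+[.,]\\d+");

    // matches() is true only when the pattern covers the entire input.
    static boolean matchesWhole(String input) {
        return NUMBER.matcher(input).matches();
    }

    // find() moves to the next partial match; collect each one in order.
    static List<String> findNumbers(String input) {
        List<String> numbers = new ArrayList<>();
        Matcher m = NUMBER.matcher(input);
        while (m.find()) {
            numbers.add(m.group());
        }
        return numbers;
    }

    public static void main(String[] args) {
        String s = "ABCDEF 1245.00 - xyztusd 98.72 mgjduc 2450.78 euwiso 87.25 ansjcy 56.12.";
        System.out.println(matchesWhole(s));          // false
        System.out.println(matchesWhole("1245.00"));  // true
        System.out.println(findNumbers(s));           // [1245.00, 98.72, 2450.78, 87.25, 56.12]
    }
}
```

The whole sentence is not a number, so matches() returns false for it, while the find() loop picks out each number in turn.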

bogdyro
Posts: 241
Joined: 04 Apr 2015 15:14

Re: How to extract all number occurrences in a string with R

Post by bogdyro » 11 Dec 2016 08:37

So, after searching again, I found this: http://stackoverflow.com/questions/1428 ... va-and-net
It seems there's some confusion in the naming of the functions in Java. The matches() function actually means "matches all", and the find() function is equivalent to the matches() function in .NET, meaning that it can also match parts of the string.
So I think some changes are needed in Automagic, built around the find() function, for the regex engine to work as expected.
Thanks, Martin.

rcfree
Posts: 43
Joined: 21 Mar 2013 16:51

Re: How to extract all number occurrences in a string with R

Post by rcfree » 11 Dec 2016 11:43

Thanks for the help guys.

I did some research too, and it seems that what I want is possible in Java. Since Android is basically Java, I believe it only needs to be implemented as an Automagic function.

Here are some links I found:

http://www.java2novice.com/java-collect ... t-capture/

http://www.java2s.com/Code/Java/Regular ... azAZ09.htm

It would be very interesting if it were implemented in Automagic, making it possible to extract any information within any string returning a list with the extracted parts.

Martin
Posts: 4468
Joined: 09 Nov 2012 14:23

Re: How to extract all number occurrences in a string with R

Post by Martin » 13 Dec 2016 19:49

That's correct, matches has to match the entire input. I'm aware of find, I just have not found a good way to integrate it. I could add a function like findAll that returns a list of all matched groups (or a list of lists in case there are multiple groups in the regex).
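
As a rough illustration of what such a function could do, here is a sketch in plain Java. The class name FindAllSketch, the argument order, and the choice to always return a list of lists are assumptions made for the example; none of this is an actual Automagic implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindAllSketch {
    // Hypothetical findAll: one inner list per match.
    // Without capture groups the inner list holds the whole match;
    // with capture groups it holds groups 1..n of that match.
    static List<List<String>> findAll(String input, String regex) {
        List<List<String>> results = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            List<String> groups = new ArrayList<>();
            if (m.groupCount() == 0) {
                groups.add(m.group());
            } else {
                for (int i = 1; i <= m.groupCount(); i++) {
                    groups.add(m.group(i)); // null if the group did not participate
                }
            }
            results.add(groups);
        }
        return results;
    }

    public static void main(String[] args) {
        String s = "ABCDEF 1245.00 - xyztusd 98.72";
        System.out.println(findAll(s, "\\d+[.,]\\d+"));     // [[1245.00], [98.72]]
        System.out.println(findAll(s, "(\\d+)[.,](\\d+)")); // [[1245, 00], [98, 72]]
    }
}
```

Returning a list of lists in every case keeps the result shape uniform; flattening to a plain list when the regex has no groups would be an equally reasonable design.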

Regards,
Martin

bogdyro
Posts: 241
Joined: 04 Apr 2015 15:14

Re: How to extract all number occurrences in a string with R

Post by bogdyro » 14 Dec 2016 06:22

Yes, that would work nicely, Martin. It would also be great if it could be added to the tester screen.

rcfree
Posts: 43
Joined: 21 Mar 2013 16:51

Re: How to extract all number occurrences in a string with R

Post by rcfree » 14 Dec 2016 10:59

It will be very useful.
