ruby - Regex to get ID from link URL


I have links like this:

<div class="zg_title">   <a href="http://rads.stackoverflow.com/amzn/click/b000o3gcfu">thermos foogo leak-proof stainless st...</a>      </div> 

And I'm scraping them like this:

  product_asin = product.xpath('//div[@class="zg_title"]/a/@href').first.value  

The problem is that this returns the whole URL, and I only want the ID:

b000o3gcfu 

I think I need something like this:

product_asin = product.xpath('//div[@class="zg_title"]/a/@href').first.value[regex_here] 

What's the simplest regex I can use in this case?

Edit:

Strange, the link URL above doesn't appear in full. The complete URL is:

http://www.amazon.com/thermos-foogo-leak-proof-stainless-10-ounce/dp/b000o3gcfu/ref=zg_bs_baby-products_1 

Use /\w+$/:

p doc.xpath('//div[@class="zg_title"]/a/@href').first.value[/\w+$/] 

/\w+$/ matches the trailing run of word characters (letters, digits, and underscores) at the end of the string.
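To see what that pattern picks out, here is a small sketch run against the two URLs from the question. The variable names are only for illustration. Note that on the longer URL from the edit the ID is not the last path segment, so the trailing match returns something else.

```ruby
# The two URLs quoted in the question.
short_url = "http://rads.stackoverflow.com/amzn/click/b000o3gcfu"
full_url  = "http://www.amazon.com/thermos-foogo-leak-proof-stainless-10-ounce/dp/b000o3gcfu/ref=zg_bs_baby-products_1"

# String#[] with a Regexp returns the matched substring, or nil if nothing matches.
short_match = short_url[/\w+$/]  # => "b000o3gcfu"
full_match  = full_url[/\w+$/]   # => "products_1" (the hyphen is not a word character)

puts short_match
puts full_match
```

So /\w+$/ is enough when the ID is the final segment of the href, which is the case for the scraped link.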


require 'nokogiri'

s = <<EOF
<div class="zg_title">   <a href="http://rads.stackoverflow.com/amzn/click/b000o3gcfu">thermos foogo leak-proof stainless st...</a>      </div>
EOF

doc = Nokogiri::HTML(s)
p doc.xpath('//div[@class="zg_title"]/a/@href').first.value[/\w+$/]
# => "b000o3gcfu"
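If the href can also be the full Amazon URL from the edit, where the ID sits in the middle of the path, a capture group anchored on the preceding path segment is one option. This is only a sketch: `extract_asin` is a hypothetical helper name, and treating `click` and `dp` as the only markers is an assumption based on the two URLs shown in the question.

```ruby
# Hypothetical helper: return the ID that follows "/click/" or "/dp/".
# Assumption: these two segments are the only markers that precede the ID.
def extract_asin(url)
  # String#[] with a Regexp and a capture index returns that capture group,
  # or nil when the pattern does not match.
  url[%r{/(?:click|dp)/(\w+)}, 1]
end

short_url = "http://rads.stackoverflow.com/amzn/click/b000o3gcfu"
full_url  = "http://www.amazon.com/thermos-foogo-leak-proof-stainless-10-ounce/dp/b000o3gcfu/ref=zg_bs_baby-products_1"

puts extract_asin(short_url)  # => b000o3gcfu
puts extract_asin(full_url)   # => b000o3gcfu
p extract_asin("http://example.com/no-id-here")  # => nil
```

With Nokogiri this could be applied to the attribute value, e.g. `extract_asin(doc.xpath('//div[@class="zg_title"]/a/@href').first.value)`.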
