ruby - Regex to get ID from link URL

- April 15, 2014

I have links like this:

<div class="zg_title">   <a href="http://rads.stackoverflow.com/amzn/click/b000o3gcfu">thermos foogo leak-proof stainless st...</a>      </div>

and I'm scraping them like this:

  product_asin = product.xpath('//div[@class="zg_title"]/a/@href').first.value

The problem is that this takes the whole URL, and I want only the ID:

b000o3gcfu

I think I need something like this:

product_asin = product.xpath('//div[@class="zg_title"]/a/@href').first.value[regex_here]

What's the simplest regex I can use in this case?

Edit:

Strangely, the link URL above doesn't appear complete. The full URL is:

http://www.amazon.com/thermos-foogo-leak-proof-stainless-10-ounce/dp/b000o3gcfu/ref=zg_bs_baby-products_1

Use /\w+$/:

p doc.xpath('//div[@class="zg_title"]/a/@href').first.value[/\w+$/]

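/\w+$/ matches the trailing run of word characters (letters, digits, and underscores) at the end of the string, which here is the last path segment of the URL.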

require 'nokogiri'

# Sample markup from the question
s = <<EOF
<div class="zg_title">
  <a href="http://rads.stackoverflow.com/amzn/click/b000o3gcfu">thermos foogo leak-proof stainless st...</a>
</div>
EOF

doc = Nokogiri::HTML(s)
# Take the first matching href and keep only its trailing word characters
p doc.xpath('//div[@class="zg_title"]/a/@href').first.value[/\w+$/]
# => "b000o3gcfu"
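Note that /\w+$/ relies on the ID being the last path segment, which holds for the shortened link but not for the full URL given in the edit. The sketch below compares the two using plain strings (no Nokogiri needed). It assumes the ID is always the segment directly after "/dp/" in the full URL; that is inferred from the one example in the question, not from any documented URL format.

```ruby
# Both URLs are copied from the question.
short_url = "http://rads.stackoverflow.com/amzn/click/b000o3gcfu"
full_url  = "http://www.amazon.com/thermos-foogo-leak-proof-stainless-10-ounce/dp/b000o3gcfu/ref=zg_bs_baby-products_1"

# /\w+$/ grabs whatever word characters end the string.
short_id = short_url[/\w+$/]   # => "b000o3gcfu"
tail     = full_url[/\w+$/]    # => "products_1" (not the ID)

# Assumption: in the full URL the ID follows "/dp/".
# String#[] with a regex and a capture index returns only that group.
full_id = full_url[%r{/dp/(\w+)}, 1]   # => "b000o3gcfu"

p short_id, tail, full_id
```

If the href can be either form, trying the "/dp/" pattern first and falling back to /\w+$/ is one option, e.g. url[%r{/dp/(\w+)}, 1] || url[/\w+$/].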
