Scrapy: MySQL Pipeline -- Unexpected Errors Encountered


I'm getting a number of errors, depending on what is being inserted or updated.

Here is the code that processes each item:

    def process_item(self, item, spider):
        try:
            if 'producer' in item:
                self.cursor.execute("""insert into producers (title, producer) values (%s, %s)""", (item['title'], item['producer']))
            elif 'actor' in item:
                self.cursor.execute("""insert into actors (title, actor) values (%s, %s)""", (item['title'], item['actor']))
            elif 'director' in item:
                self.cursor.execute("""insert into directors (title, director) values (%s, %s)""", (item['title'], item['director']))
            else:
                self.cursor.execute("""update example_movie set distributor=%S, rating=%s, genre=%s, budget=%s where title=%s""", (item['distributor'], item['rating'], item['genre'], item['budget'], item['title']))
            self.conn.commit()
        except MySQLdb.Error, e:
            print "error %d: %s" % (e.args[0], e.args[1])

        return item

Here is an example of the items returned by the scraper:

    [{'budget': [u'n/a'], 'distributor': [u'lorimar'], 'genre': [u'action'], 'rating': [u'r'], 'title': [u'action jackson']},
     {'actor': u'craig t. nelson', 'title': [u'action jackson']},
     {'actor': u'sharon stone', 'title': [u'action jackson']},
     {'actor': u'carl weathers', 'title': [u'action jackson']},
     {'producer': u'joel silver', 'title': [u'action jackson']},
     {'director': u'craig r. baxley', 'title': [u'action jackson']}]

Here are the errors returned:

    2013-08-25 23:04:57-0500 [actorspider] ERROR: Error processing {'budget': [u'n/a'],
     'distributor': [u'lorimar'],
     'genre': [u'action'],
     'rating': [u'r'],
     'title': [u'action jackson']}
    Traceback (most recent call last):
      File "/library/python/2.7/site-packages/scrapy/middleware.py", line 62, in _process_chain
        return process_chain(self.methods[methodname], obj, *args)
      File "/library/python/2.7/site-packages/scrapy/utils/defer.py", line 65, in process_chain
        d.callback(input)
      File "/system/library/frameworks/python.framework/versions/2.7/extras/lib/python/twisted/internet/defer.py", line 361, in callback
        self._startRunCallbacks(result)
      File "/system/library/frameworks/python.framework/versions/2.7/extras/lib/python/twisted/internet/defer.py", line 455, in _startRunCallbacks
        self._runCallbacks()
    --- <exception caught here> ---
      File "/system/library/frameworks/python.framework/versions/2.7/extras/lib/python/twisted/internet/defer.py", line 542, in _runCallbacks
        current.result = callback(current.result, *args, **kw)
      File "/users/fortylashes/documents/management_work/boxofficemojo/boxofficemojo/pipelines.py", line 53, in process_item
        self.cursor.execute("""update example_movie set distributor=%S, rating=%s, genre=%s, budget=%s where title=%s""", (item['distributor'], item['rating'], item['genre'], item['budget'], item['title']))
      File "/opt/local/library/frameworks/python.framework/versions/2.7/lib/python2.7/site-packages/mysqldb/cursors.py", line 159, in execute
        query = query % db.literal(args)
    exceptions.ValueError: unsupported format character 'S' (0x53) at index 38

    error 1064: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '), 'craig t. nelson')' at line 1
    2013-08-25 23:04:57-0500 [actorspider] DEBUG: Scraped from <200 http://www.boxofficemojo.com/movies/?id=actionjackson.htm>
        {'actor': u'craig t. nelson', 'title': [u'action jackson']}
    error 1064: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '), 'sharon stone')' at line 1
    2013-08-25 23:04:57-0500 [actorspider] DEBUG: Scraped from <200 http://www.boxofficemojo.com/movies/?id=actionjackson.htm>
        {'actor': u'sharon stone', 'title': [u'action jackson']}
    error 1064: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '), 'carl weathers')' at line 1
    2013-08-25 23:04:57-0500 [actorspider] DEBUG: Scraped from <200 http://www.boxofficemojo.com/movies/?id=actionjackson.htm>
        {'actor': u'carl weathers', 'title': [u'action jackson']}
    error 1064: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '), 'joel silver')' at line 1
    2013-08-25 23:04:57-0500 [actorspider] DEBUG: Scraped from <200 http://www.boxofficemojo.com/movies/?id=actionjackson.htm>
        {'producer': u'joel silver', 'title': [u'action jackson']}
    error 1064: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '), 'craig r. baxley')' at line 1
    2013-08-25 23:04:57-0500 [actorspider] DEBUG: Scraped from <200 http://www.boxofficemojo.com/movies/?id=actionjackson.htm>
        {'director': u'craig r. baxley', 'title': [u'action jackson']}

Apparently there are a lot of issues. Thank you for reading! Any suggestions or ideas are appreciated!

::::update/more info::::

There appear to be 3 movies, out of a test set of 52 total, being inserted into the actors, producers, and directors tables. Note: the update statement isn't working at all.

These movies are: Abraham Lincoln: Vampire Hunter, Ace Ventura: Pet Detective, and Ace Ventura: When Nature Calls.

Interestingly, all three of these movies have a ':' in the title. I'm not sure what that means; if anyone has an idea, please share it!

:::::insert solved:::::

It turns out the problem was caused by the scraper putting individual items in a list: {'actor': [u'this one guy']} as opposed to {'actor': u'this one guy'}.
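One way to guard against this inside the pipeline is to flatten single-element lists before the values reach `cursor.execute`. The sketch below is only an illustration: the helper names `first_value` and `flatten_item` are hypothetical (not part of Scrapy or MySQLdb), and it assumes each field should hold at most one value.

```python
def first_value(value):
    """Return the first element of a list or tuple, or the value itself."""
    if isinstance(value, (list, tuple)):
        # An empty list carries no value, so fall back to None (NULL in SQL).
        return value[0] if value else None
    return value


def flatten_item(item):
    """Build a plain dict whose values are scalars, not one-element lists."""
    return {key: first_value(value) for key, value in dict(item).items()}


# Sample shaped like the scraped items above: 'title' is wrapped in a list.
sample = {'actor': 'craig t. nelson', 'title': ['action jackson']}
flat = flatten_item(sample)
print(flat)
```

Calling something like `flatten_item(item)` at the top of `process_item` would let the existing `execute` calls receive plain strings for both columns.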

You have used the wrong format specifier for the string data type at line 53 of your code. It should be a small 's', not a capital 'S'.

    self.cursor.execute("""update example_movie set distributor=%S, rating=%s, genre=%s, budget=%s where title=%s""", (item['distributor'], item['rating'], item['genre'], item['budget'], item['title']))

It should be like this:

    self.cursor.execute("""update example_movie set distributor=%s, rating=%s, genre=%s, budget=%s where title=%s""", (item['distributor'], item['rating'], item['genre'], item['budget'], item['title']))
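The failure can be reproduced without a database. The traceback in the question shows MySQLdb building the final query with Python's `%` operator (`query = query % db.literal(args)`), so an unknown conversion such as `%S` raises `ValueError` before anything is sent to MySQL. A minimal sketch of that step, using a shortened query and made-up, pre-quoted placeholder values:

```python
# Same prefix as the failing statement, shortened to three parameters.
bad_query = "update example_movie set distributor=%S, rating=%s where title=%s"
good_query = "update example_movie set distributor=%s, rating=%s where title=%s"

# Placeholder values standing in for what the driver would have escaped.
params = ("'lorimar'", "'r'", "'action jackson'")

try:
    bad_query % params
    error_message = None
except ValueError as exc:
    # Python rejects '%S' because it is not a valid conversion type.
    error_message = str(exc)

print(error_message)
print(good_query % params)
```

The reported index points at the 'S' that follows `distributor=%`, which matches the position given in the original traceback.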
