I'm scraping Reddit using PRAW and storing the records in a pandas DataFrame. I'm using SQLAlchemy and PyMySQL to connect to my AWS RDS database, and to_sql to append records to an existing table. Everything works until I call to_sql, which throws the error below, and I'm not sure where to go from here. Any help or suggestions would be awesome!

engine = sqlalchemy.create_engine('mysql+pymysql://username:[email protected]:3306/socialdata')
df_comment = pd.DataFrame(comment_table)

df_comment.to_sql(name='reddit_comments', con=engine, index=False, if_exists='append')
Traceback (most recent call last):
  File "/Users/ty/Desktop/Python/reddit_scraper.py", line 121, in <module>
    df_comment.to_sql(name='reddit_comments', con=engine, index=False, if_exists='append')
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pandas/core/generic.py", line 2605, in to_sql
    sql.to_sql(
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pandas/io/sql.py", line 589, in to_sql
    pandas_sql.to_sql(
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pandas/io/sql.py", line 1398, in to_sql
    table.insert(chunksize, method=method)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pandas/io/sql.py", line 830, in insert
    exec_insert(conn, keys, chunk_iter)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pandas/io/sql.py", line 747, in _execute_insert
    conn.execute(self.table.insert(), data)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1011, in execute
    return meth(self, multiparams, params)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1124, in _execute_clauseelement
    ret = self._execute_context(
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1316, in _execute_context
    self._handle_dbapi_exception(
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1514, in _handle_dbapi_exception
    util.raise_(exc_info[1], with_traceback=exc_info[2])
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
    raise exception
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1256, in _execute_context
    self.dialect.do_executemany(
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/sqlalchemy/dialects/mysql/mysqldb.py", line 148, in do_executemany
    rowcount = cursor.executemany(statement, parameters)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pymysql/cursors.py", line 188, in executemany
    return self._do_execute_many(q_prefix, q_values, q_postfix, args,
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pymysql/cursors.py", line 206, in _do_execute_many
    v = values % escape(next(args), conn)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pymysql/cursors.py", line 120, in _escape_args
    return {key: conn.literal(val) for (key, val) in args.items()}
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pymysql/cursors.py", line 120, in <dictcomp>
    return {key: conn.literal(val) for (key, val) in args.items()}
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pymysql/connections.py", line 469, in literal
    return self.escape(obj, self.encoders)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pymysql/connections.py", line 462, in escape
    return converters.escape_item(obj, self.charset, mapping=mapping)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pymysql/converters.py", line 27, in escape_item
    val = encoder(val, mapping)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pymysql/converters.py", line 123, in escape_unicode
    return u"'%s'" % _escape_unicode(value)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pymysql/converters.py", line 78, in _escape_unicode
    return value.translate(_escape_table)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/praw/models/reddit/base.py", line 35, in __getattr__
    return getattr(self, attribute)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/praw/models/reddit/base.py", line 36, in __getattr__
    raise AttributeError(
AttributeError: 'Redditor' object has no attribute 'translate'

1 Answer

One of the columns in your DataFrame contains a custom "Redditor" object, which doesn't map to any SQL data type. As the traceback shows, pymysql hands the value to its string escaper, which calls the value's translate() method. A Redditor has no such method, hence the AttributeError.
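To make the failure concrete, here is a minimal sketch that imitates the last step of the traceback using only the standard library. `FakeRedditor` and `escape_as_string` are hypothetical stand-ins written for illustration; they are not PRAW's or pymysql's actual code, and the escape table is deliberately simplified.

```python
class FakeRedditor:
    """Hypothetical stand-in for an arbitrary non-string object."""

    def __init__(self, name):
        self.name = name


def escape_as_string(value):
    # Simplified imitation of a driver escaping a value it assumes is a str:
    # it calls value.translate(...) with an escape table.
    return "'%s'" % value.translate({ord("'"): "\\'"})


# A real string has .translate(), so escaping works.
escaped = escape_as_string("it's")

# An arbitrary object does not, so the same call raises AttributeError.
try:
    escape_as_string(FakeRedditor("example_user"))
except AttributeError as exc:
    error_message = str(exc)
```

The point is only that any value reaching this step needs to behave like a string, so the column has to be converted before the insert.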

If Redditor is just a wrapper object for a username and other metadata (as with the PRAW class in your traceback), you can remap that column to a string or number representation of the object before writing. If it were a class you had defined yourself, you could instead add a translate() method to its definition that returns the appropriate value. For example, if Redditor.id contains the value that you want to store in the column:

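class Redditor():
  def translate(self, table):
    # pymysql calls value.translate(table), so the method must accept the table.
    # Replace self.id with the value you care about.
    return str(self.id).translate(table)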

Or, in pandas, convert the column before you save:

df[REDDITOR_COLUMN] = df[REDDITOR_COLUMN].apply(lambda x: x.id)
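If you would rather not track which columns hold objects, one option is to clean the records before building the DataFrame. The sketch below assumes `comment_table` is a list of dicts (the question doesn't say) and that `str()` of the object gives something worth storing, such as the username. `FakeRedditor`, `to_sql_safe`, and the sample rows are hypothetical names and data used only for illustration.

```python
import datetime

# Types passed through unchanged; everything else is converted with str().
SQL_SAFE_TYPES = (type(None), bool, int, float, str, bytes,
                  datetime.date, datetime.datetime)


def to_sql_safe(value):
    """Return value unchanged if it is a plain type, else its string form."""
    if isinstance(value, SQL_SAFE_TYPES):
        return value
    return str(value)


class FakeRedditor:
    """Hypothetical stand-in for the object stored in the author field."""

    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name


# Assumed shape of comment_table: one dict per comment.
comment_table = [
    {"comment_id": "abc123", "author": FakeRedditor("example_user"), "score": 5},
    {"comment_id": "def456", "author": None, "score": 2},  # e.g. deleted author
]

clean_rows = [{key: to_sql_safe(val) for key, val in row.items()}
              for row in comment_table]
```

Passing `clean_rows` to `pd.DataFrame` instead of the raw records should leave only plain values for to_sql to insert. Missing authors stay as None rather than becoming the string "None".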


1 Comment

hey you're absolutely right I got ahead of myself for a second without checking, editing the post accordingly, thanks!
