Daniel, Tobias, Renato and I have been looking into the potential underlying reason
why http://llvm.org/perf/ is unstable, and have found some clues. I want to share them here
to give people with more experience in the frameworks used by LNT (flask, sqlalchemy, wsgi, …)
a chance to check whether our reasoning below seems plausible.
Daniel noticed the following backtrace in the log after http://llvm.org/perf started giving “Internal Server Error”
again:
2015-05-08 22:57:05,309 ERROR: Exception on /db_default/v4/nts/287/graph [GET] [in /opt/venv/perf/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py:1423]
Traceback (most recent call last):
  File "/opt/venv/perf/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1817, in wsgi_app
    response = self.full_dispatch_request()
  File "/opt/venv/perf/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1477, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/opt/venv/perf/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1381, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/opt/venv/perf/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1475, in full_dispatch_request
    rv = self.dispatch_request()
  File "/opt/venv/perf/lib/python2.7/site-packages/Flask-0.10.1-py2.7.egg/flask/app.py", line 1461, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/opt/venv/perf/lib/python2.7/site-packages/LNT-0.4.1dev-py2.7.egg/lnt/server/ui/decorators.py", line 67, in wrap
    result = f(**args)
  File "/opt/venv/perf/lib/python2.7/site-packages/LNT-0.4.1dev-py2.7.egg/lnt/server/ui/views.py", line 385, in v4_run_graph
    ts = request.get_testsuite()
  File "/opt/venv/perf/lib/python2.7/site-packages/LNT-0.4.1dev-py2.7.egg/lnt/server/ui/app.py", line 76, in get_testsuite
    testsuites = self.get_db().testsuite
  File "/opt/venv/perf/lib/python2.7/site-packages/LNT-0.4.1dev-py2.7.egg/lnt/server/ui/app.py", line 55, in get_db
    self.db = current_app.old_config.get_database(g.db_name, echo=echo)
  File "/opt/venv/perf/lib/python2.7/site-packages/LNT-0.4.1dev-py2.7.egg/lnt/server/config.py", line 148, in get_database
    return lnt.server.db.v4db.V4DB(db_entry.path, self, echo=echo)
  File "/opt/venv/perf/lib/python2.7/site-packages/LNT-0.4.1dev-py2.7.egg/lnt/server/db/v4db.py", line 108, in __init__
    .filter_by(id = lnt.testing.PASS).first()
  File "/opt/venv/perf/lib/python2.7/site-packages/SQLAlchemy-0.9.6-py2.7.egg/sqlalchemy/orm/query.py", line 2334, in first
    ret = list(self[0:1])
  File "/opt/venv/perf/lib/python2.7/site-packages/SQLAlchemy-0.9.6-py2.7.egg/sqlalchemy/orm/query.py", line 2201, in __getitem__
    return list(res)
  File "/opt/venv/perf/lib/python2.7/site-packages/SQLAlchemy-0.9.6-py2.7.egg/sqlalchemy/orm/query.py", line 2405, in __iter__
    return self._execute_and_instances(context)
  File "/opt/venv/perf/lib/python2.7/site-packages/SQLAlchemy-0.9.6-py2.7.egg/sqlalchemy/orm/query.py", line 2418, in _execute_and_instances
    close_with_result=True)
  File "/opt/venv/perf/lib/python2.7/site-packages/SQLAlchemy-0.9.6-py2.7.egg/sqlalchemy/orm/query.py", line 2409, in _connection_from_session
    **kw)
  File "/opt/venv/perf/lib/python2.7/site-packages/SQLAlchemy-0.9.6-py2.7.egg/sqlalchemy/orm/session.py", line 846, in connection
    close_with_result=close_with_result)
  File "/opt/venv/perf/lib/python2.7/site-packages/SQLAlchemy-0.9.6-py2.7.egg/sqlalchemy/orm/session.py", line 850, in _connection_for_bind
    return self.transaction._connection_for_bind(engine)
  File "/opt/venv/perf/lib/python2.7/site-packages/SQLAlchemy-0.9.6-py2.7.egg/sqlalchemy/orm/session.py", line 315, in _connection_for_bind
    conn = bind.contextual_connect()
  File "/opt/venv/perf/lib/python2.7/site-packages/SQLAlchemy-0.9.6-py2.7.egg/sqlalchemy/engine/base.py", line 1737, in contextual_connect
    self.pool.connect(),
  File "/opt/venv/perf/lib/python2.7/site-packages/SQLAlchemy-0.9.6-py2.7.egg/sqlalchemy/pool.py", line 332, in connect
    return _ConnectionFairy._checkout(self)
  File "/opt/venv/perf/lib/python2.7/site-packages/SQLAlchemy-0.9.6-py2.7.egg/sqlalchemy/pool.py", line 630, in _checkout
    fairy = _ConnectionRecord.checkout(pool)
  File "/opt/venv/perf/lib/python2.7/site-packages/SQLAlchemy-0.9.6-py2.7.egg/sqlalchemy/pool.py", line 433, in checkout
    rec = pool._do_get()
  File "/opt/venv/perf/lib/python2.7/site-packages/SQLAlchemy-0.9.6-py2.7.egg/sqlalchemy/pool.py", line 945, in _do_get
    (self.size(), self.overflow(), self._timeout))
TimeoutError: QueuePool limit of size 5 overflow 10 reached, connection timed out, timeout 30
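To make the last line of the traceback concrete: the pool allowed 5 pooled plus 10 overflow connections, so a 16th simultaneous checkout waited 30 seconds and then failed. The toy model below is only a sketch of that behaviour and is not SQLAlchemy's implementation; ToyPool and leak_until_timeout are hypothetical names invented for illustration. It shows that connections which are checked out but never returned will exhaust a bounded pool after size + overflow checkouts.

```python
import queue


class ToyPool:
    """Hypothetical stand-in for a bounded connection pool (not SQLAlchemy code)."""

    def __init__(self, size, max_overflow, timeout):
        self.timeout = timeout
        self._slots = queue.Queue()
        # The pool can hand out at most size + max_overflow connections at once.
        for n in range(size + max_overflow):
            self._slots.put("conn-%d" % n)

    def checkout(self):
        """Hand out a connection, or fail once the timeout expires."""
        try:
            return self._slots.get(timeout=self.timeout)
        except queue.Empty:
            raise TimeoutError("pool limit reached, connection timed out")

    def checkin(self, conn):
        """Return a connection so that it can be reused."""
        self._slots.put(conn)


def leak_until_timeout(pool, attempts):
    """Check out connections without returning them; count the successes."""
    held = []
    try:
        for _ in range(attempts):
            held.append(pool.checkout())
    except TimeoutError:
        pass
    return len(held)


if __name__ == "__main__":
    # Same limits as in the log, but with a short timeout to keep the demo fast.
    print(leak_until_timeout(ToyPool(size=5, max_overflow=10, timeout=0.01), 20))
```

With the limits from the log, only 15 of the 20 attempted checkouts succeed; if every checkout were followed by a checkin, the pool would never run dry.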
After browsing through the SQLAlchemy documentation and bits of the LNT implementation,
it seems so far that the following pieces may be the key parts that cause the problem
shown in the log.
The SQLAlchemy documentation recommends using one SQLAlchemy session per web
request. Looking at the following pieces of LNT, I got the impression that a
session is instead shared between many or all requests:
In ui/app.py, Request.get_db() essentially caches the result of get_database from “config”:
…
class Request(flask.Request):
    …
    def get_db(self):
        …
        if self.db is None:
            echo = bool(self.args.get('db_log') or self.form.get('db_log'))
            self.db = current_app.old_config.get_database(g.db_name, echo=echo)
        …
        return self.db
In config.py, get_database returns a V4DB object by calling its constructor:
…
def get_database(self, name, echo=False):
    …
    # Instantiate the appropriate database version.
    if db_entry.db_version == '0.4':
        return lnt.server.db.v4db.V4DB(db_entry.path, self,
                                       db_entry.baseline_revision,
                                       echo)
    …
This constructor is in db/v4db.py:
…
class V4DB(object):
    …
    def __init__(self, path, config, baseline_revision=0, echo=False):
        …
        self.session = sqlalchemy.orm.sessionmaker(self.engine)()
        …
        # Add several shortcut aliases.
        self.add = self.session.add
        self.commit = self.session.commit
        self.query = self.session.query
        self.rollback = self.session.rollback
        …
It seems that a single session object is created in this constructor and is ultimately
shared across all Requests. Instead, the request.get_db method should probably
create a new session for each request and close that session when the request is finalized,
which probably needs to be done by hooking into something Flask-specific.
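As a rough illustration of the session-per-request idea, here is a minimal standalone sketch. It is not LNT code: the in-memory SQLite engine, the /ping route, and the names get_session, close_session and closed_sessions are all assumptions made up for the example. It relies on Flask's teardown_request hook, which runs when a request finishes, including when the view raised an exception.

```python
import flask
import sqlalchemy
from sqlalchemy.orm import sessionmaker

app = flask.Flask(__name__)

# One engine (and therefore one connection pool) for the whole process.
# An in-memory SQLite database is assumed purely for the demo.
engine = sqlalchemy.create_engine("sqlite://")

# A session factory; each call returns a fresh Session bound to the engine.
Session = sessionmaker(bind=engine)

# Demo-only bookkeeping so the behaviour can be observed from outside.
closed_sessions = []


def get_session():
    """Return the current request's session, creating it on first use."""
    if "db_session" not in flask.g:
        flask.g.db_session = Session()
    return flask.g.db_session


@app.teardown_request
def close_session(exc):
    """Close the request's session so its connection goes back to the pool."""
    session = flask.g.pop("db_session", None)
    if session is not None:
        session.close()
        closed_sessions.append(session)


@app.route("/ping")
def ping():
    # Hypothetical view: run a trivial query through the per-request session.
    value = get_session().execute(sqlalchemy.text("SELECT 1")).scalar()
    return str(value)
```

The engine and session factory are created once, while each request gets its own short-lived session. Whether this maps cleanly onto LNT's V4DB object is a separate question.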
The self.add and following lines in the constructor suggest that it will probably be
non-trivial to refactor the code so that there is no longer a single session per V4DB object.
We’re not sure whether making separate sessions per Request will solve the http://llvm.org/perf
instability problems, but that’s the best idea we’ve got so far.
Thanks,
Kristof