Tuesday, January 13, 2015

How to fetch only the SQL scripts, and not the HTML, from a specific URL

I am trying to fetch a set of SQL scripts from URLs that are reachable only on my LAN and then execute them with Python's sqlite3 module. However, the response to each request contains HTML in addition to the SQL, and this HTML prevents the script from executing, so I cannot create the database.


Here is the snippet of source code that does the request:



import requests

builds_range = range(1300, 1351)
print('Getting data from the following URLs:')
for build in builds_range:
    # Build the URL for this build number.
    database_url = r'''http://ift.tt/1DEYokK'''.format(build_number = build)
    print(database_url)
    # Debug output: show the raw response body.
    print(requests.get(database_url).text)
    import_db(db_script_url = database_url, submit_cur = db_cursor,
              parameter_parser = lambda parameter : parameter_parser_final_iteration_with_constraint(parameter, None),
              build_number = build, extract_runtime = False)


And this is the snippet of code that actually does the execution of the SQL script to form the database:



import requests
import sqlite3

def import_db(db_script_url, submit_cur, parameter_parser, build_number, extract_runtime):
    conn = sqlite3.connect(':memory:')
    # Download the script and execute the whole response body as SQL.
    conn.executescript(requests.get(db_script_url).text)
    conn.commit()


The output of printing requests.get(database_url).text shows the HTML that gets in the way:



<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://ift.tt/kkyg93">
<html>
<head><title>Log File contents</title>
<link rel="stylesheet" href="../../../../../../../default.css" type="text/css" />
</head>
<body class='log'>
<a href="data/text">(view as text)</a><br/>
<pre><span class="stdout">
BEGIN TRANSACTION;

create table highlevelsynthesis(
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT,
parameter TEXT,
rtl_output TEXT
);


Further down in the response, the single and double quote characters are also encoded as HTML character references:



INSERT INTO
logfile(name, parameter, stdout, stderr, test_file, synthesis_config_file, status)
VALUES (&#39;dfmul&#39;, &#39;[[&#34;scheduler_type&#34;, &#34;sdc&#34;], [&#34;max_chain_delay&#34;, 0.9]]&#39;, &#39;/nfs/home/hongbin.zheng/buildslaves/253/Simulation/build/shang-build/tools/shang/testsuite/benchmark/legup_chstone/dfmul/20141210-103616-710585/hls.stdout&#39;, &#39;/nfs/home/hongbin.zheng/buildslaves/253/Simulation/build/shang-build/tools/shang/testsuite/benchmark/legup_chstone/dfmul/20141210-103616-710585/hls.stderr&#39;, &#39;/nfs/home/hongbin.zheng/buildslaves/253/Simulation/build/shang-build/tools/shang/testsuite/benchmark/legup_chstone/dfmul.bc&#39;, &#39;/nfs/home/hongbin.zheng/buildslaves/253/Simulation/build/shang-build/tools/shang/testsuite/benchmark/legup_chstone/dfmul/20141210-103616-710585/test_config.lua&#39;, &#39;passed&#39;);


Each &#39; should be a single quote (') and each &#34; should be a double quote (").
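The character references above can be decoded with the standard library. This is a minimal sketch, assuming Python 3.4 or later (the code in this question is Python 2, where html.unescape is not available). The sample string is a shortened fragment of the INSERT statement above, not the full line.

```python
import html

# Shortened fragment of the INSERT statement shown above (illustrative only).
escaped = "VALUES (&#39;dfmul&#39;, &#39;[[&#34;scheduler_type&#34;, &#34;sdc&#34;]]&#39;);"

# html.unescape converts numeric and named character references
# back into the characters they stand for.
unescaped = html.unescape(escaped)
print(unescaped)
```

Decoding the references alone does not remove the surrounding HTML tags, so it is only part of a fix.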


Because of this HTML, executing conn.executescript(requests.get(db_script_url).text) raises the following error: OperationalError: near "<": syntax error.
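One possible way around the error is to pull out only the text inside the page's pre element before handing it to SQLite. The following is a sketch under stated assumptions: it uses Python 3, the class PreTextExtractor and the function extract_sql are hypothetical helpers, and sample_page is an invented, reduced stand-in for the real response (including its COMMIT line and table layout).

```python
import sqlite3
from html.parser import HTMLParser


class PreTextExtractor(HTMLParser):
    """Collects the text found inside <pre> elements (hypothetical helper)."""

    def __init__(self):
        # convert_charrefs=True makes the parser decode &#39; and &#34;
        # before the text reaches handle_data.
        super().__init__(convert_charrefs=True)
        self._depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "pre" and self._depth > 0:
            self._depth -= 1

    def handle_data(self, data):
        # Keep text only while we are inside a <pre> element.
        if self._depth > 0:
            self.chunks.append(data)


def extract_sql(page_html):
    """Return the text inside <pre>, with tags dropped and references decoded."""
    parser = PreTextExtractor()
    parser.feed(page_html)
    parser.close()
    return "".join(parser.chunks)


# Invented sample that mimics the structure of the page shown above.
sample_page = """<html>
<body class='log'>
<a href="data/text">(view as text)</a><br/>
<pre><span class="stdout">
BEGIN TRANSACTION;
create table logfile(name TEXT, status TEXT);
INSERT INTO logfile(name, status) VALUES (&#39;dfmul&#39;, &#39;passed&#39;);
COMMIT;
</span></pre>
</body>
</html>"""

sql = extract_sql(sample_page)
conn = sqlite3.connect(":memory:")
conn.executescript(sql)
rows = conn.execute("SELECT name, status FROM logfile").fetchall()
print(rows)
```

In import_db, the same idea would mean passing extract_sql(requests.get(db_script_url).text) to executescript. The "(view as text)" link in the page suggests the server may also offer a plain-text variant of the log; if it does, requesting that URL instead might avoid HTML parsing altogether, but that is a guess based on the link text.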


What really puzzles me is that when I repeat this with another URL that also contains a chunk of SQL, such as VIVADOJPEG500 = r'''http://ift.tt/1u5aI5a''', I do not have this issue at all. I initially thought it had something to do with the r prefix in front of the string, but after testing my URL with it, I found that adding the r does not solve the issue.
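A way to compare the two URLs is to check whether each response is an HTML page or plain text. The sketch below is only a heuristic and assumes the server sets a meaningful Content-Type header. The function looks_like_html is a hypothetical helper, not part of requests.

```python
def looks_like_html(content_type, body):
    """Heuristic: True if the response appears to be an HTML page."""
    # Drop parameters such as "; charset=utf-8" from the header value.
    media_type = (content_type or "").split(";")[0].strip().lower()
    if media_type in ("text/html", "application/xhtml+xml"):
        return True
    # Fall back to inspecting the start of the body.
    start = body.lstrip()[:100].lower()
    return start.startswith("<!doctype html") or start.startswith("<html")


html_case = looks_like_html("text/html; charset=utf-8", "<!DOCTYPE html PUBLIC ...")
text_case = looks_like_html("text/plain", "BEGIN TRANSACTION;\n")
print(html_case, text_case)
```

With requests, the two arguments would come from response.headers.get('Content-Type') and response.text. If the working URL returns plain text and the failing one returns HTML, the difference lies in what the server sends for each URL rather than in the Python string literal.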


Could someone who knows what is going on please help me to overcome this?


Thank you very much.

