vendredi 25 mars 2016

sqlite query with NATURAL JOIN produces extra records

I am trying to query my SQLite database in Python in hopes of returning the result as a Pandas DataFrame. However, something is going wrong in the query that causes an incorrect output with too many records. For example, if I use this query text:

query = "SELECT name, season, opponent, ratingA, ratingB  
    FROM players NATURAL JOIN games NATURAL JOIN A_ratings NATURAL JOIN B_ratings  
    WHERE (season="2015-16") AND (home_away="home") AND (tournament="tournX") AND (name="John Doe")"

The correct result should be two rows, one for each appearance by that player with those conditions. If you were to print the correct result from a cursor, it would look like this:

(u'John Doe', u'2015-16', u'TeamAlpha', 7.5, 8.0)
(u'John Doe', u'2015-16', u'TeamBeta', 6.0, 6.0)

Instead, I get the ratings for every player on the team for the games that match the conditions, with the name from the query replacing the correct name in the name column, like this:

(u'John Doe', u'2015-16', u'TeamAlpha', 7.5, 8.0)
(u'John Doe', u'2015-16', u'TeamAlpha', 8.5, 9.0)
(u'John Doe', u'2015-16', u'TeamAlpha', 6.5, 7.0)
(u'John Doe', u'2015-16', u'TeamAlpha', 6.5, 6.0)
(u'John Doe', u'2015-16', u'TeamAlpha', 7.0, 7.0))
(u'John Doe', u'2015-16', u'TeamBeta', 6.0, 6.0)
(u'John Doe', u'2015-16', u'TeamBeta', 8.0, 7.5)
(u'John Doe', u'2015-16', u'TeamBeta', 7.0, 7.0)
(u'John Doe', u'2015-16', u'TeamBeta', 7.5, 8.0)
(u'John Doe', u'2015-16', u'TeamBeta', 6.5, 7.0)

It seems there is some kind of multiplication going on between the tables, and I think the problem is in the NATURAL JOIN part but I can't figure out how to fix it.

I'm an SQL novice, obviously, but I'm stumped because the query works perfectly if I enter it into my DB Browser for SQLite.

The tables are structured like this:

CREATE TABLE "games" (
`gameID`    INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
`season`    TEXT,
`tournament`    TEXT,
`home_away` TEXT,
`opponent`  TEXT,  )

CREATE TABLE "players" (
`playerID`  INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
`name`  TEXT UNIQUE )

CREATE TABLE "A_ratings" (
`A_ratingID`    INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
`playerID`  INTEGER,
`gameID`    INTEGER,
`ratingA`   REAL,
FOREIGN KEY(`playerID`) REFERENCES `players`(`playerID`),
FOREIGN KEY(`gameID`) REFERENCES games(gameID) )

CREATE TABLE "B_ratings" (
`B_ratingID`    INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
`playerID`  INTEGER,
`gameID`    INTEGER,
`ratingB`   REAL,
FOREIGN KEY(`playerID`) REFERENCES `players`(`playerID`),
FOREIGN KEY(`gameID`) REFERENCES games(gameID) )

Here is the relevant pandas part, but the problem happens even when I just print the cursor results (as above).

cnxn = sqlite3.connect(path)
df = pd.read_sql(query, cnxn)

Aucun commentaire:

Enregistrer un commentaire