Friday, March 4, 2016

SQLite Corruption Caused by Writing Zero-Valued Bytes

I am debugging database corruption. After collecting several corrupted databases, I found that all of them were corrupted by zero-valued bytes being written into them.

So I decided to add a check to the source code that dumps the call stack, in order to find out who corrupts the database.

Here is the code I added to the SQLite source.

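/* Returns 1 if all `length` bytes of `data` are zero, 0 otherwise.
** Assumes `data` is suitably aligned for size_t access. */
int sqlite3CheckZeroValuedBytes(const unsigned char* data, const int length)
{
  const size_t* s = (const size_t*)data;
  const unsigned char* d = (const unsigned char*)data;
  int n = length/sizeof(size_t);
  int i;
  /* Compare one machine word at a time. */
  for (i = 0; i < n; i++) {
    if (s[i]!=0) {
      return 0;
    }
  }
  /* Check the trailing bytes that do not fill a whole word. */
  for (i = i*sizeof(size_t); i < length; i++) {
    if (d[i]!=0) {
      return 0;
    }
  }
  return 1;
}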
static int unixWrite(
 sqlite3_file *id,
 const void *pBuf,
 int amt,
 sqlite3_int64 offset
){
 unixFile *pFile = (unixFile*)id;
 if (amt>0 && sqlite3CheckZeroValuedBytes(pBuf, amt)) {
  /* offset is a 64-bit value, so it is printed with %lld */
  SQLITE_KNOWN_ERROR(SQLITE_CORRUPT, "writing zero-valued bytes into %s from %lld length %d", unixGetFilename(pFile->zPath), offset, amt);
 }
...
}
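
The check above reads the buffer through a size_t pointer, which assumes that pBuf is suitably aligned. As a minimal sketch of an alternative that avoids this assumption, the same test can be written with memcmp. The name isAllZeroBytes is hypothetical and not part of SQLite, and unlike the original it treats an empty buffer as "not all zero".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of SQLite.
** Returns 1 if all `length` bytes of `data` are zero, 0 otherwise.
** An empty buffer returns 0, so a zero-length write is never reported. */
int isAllZeroBytes(const void *data, size_t length)
{
  const unsigned char *p = (const unsigned char *)data;
  if (length == 0 || p[0] != 0) return 0;
  /* Byte 0 is zero; if every byte also equals its successor,
  ** then the whole buffer is zero. memcmp only reads, so the
  ** overlapping ranges are fine. */
  return memcmp(p, p + 1, length - 1) == 0;
}
```

In unixWrite this could replace the call in the condition, for example isAllZeroBytes(pBuf, (size_t)amt), with the amt>0 guard kept so that a negative amt is never converted to size_t.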

The code is simple. In [sqlite3CheckZeroValuedBytes] I check whether the data is all zero, and I added a macro [SQLITE_KNOWN_ERROR], defined in terms of [sqlite3_log], to report this error outside SQLite. Outside SQLite, I dump the call stacks of all threads, and I got this:

0x195774000 + 113628   objc_msgSend (in libobjc.dylib) + 28
0x1000f8000 + 7781724   _ZL9LogSQLitePviPKc,WCDataBase.mm,line 81
0x1000f8000 + 2836888   sqlite3_vlog,printf.c,line 1023
0x1000f8000 + 2778664   sqlite3KnownError,main.c,line 3192
0x1000f8000 + 2554560   unixWrite,os_unix.c,line 3335
0x1000f8000 + 2821984   sqlite3WalCheckpoint,wal.c,line 1798
0x1000f8000 + 2819864   sqlite3WalClose,wal.c,line 1914
0x1000f8000 + 2529964   sqlite3PagerClose,pager.c,line 3995
0x1000f8000 + 2574152   sqlite3BtreeClose,btree.c,line 2516
0x1000f8000 + 2774444   sqlite3LeaveMutexAndCloseZombie,main.c,line 10834297741736
0x1000f8000 + 2774220   sqlite3Close,main.c,line 1026

This is the only thread operating on the database; the call stacks of the other threads are unrelated. You can see that SQLite is checkpointing: the zero-valued bytes are written into the database file during the checkpoint, and that is why my database becomes corrupted. Even after reading the source code, I have no idea how this happens.
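
For readers who want to reproduce this kind of report, here is a rough sketch of a log hook that dumps a call stack when a corruption error is logged. It is not my actual callback: the names exampleLogCallback, g_lastFrameCount and the EXAMPLE_ constants are hypothetical, it assumes a platform that provides backtrace() in execinfo.h (glibc, macOS, iOS), and it only dumps the calling thread, not all threads.

```c
#include <execinfo.h>   /* backtrace(), backtrace_symbols_fd() */
#include <stdio.h>
#include <unistd.h>     /* STDERR_FILENO */

#define EXAMPLE_SQLITE_CORRUPT 11   /* value of SQLITE_CORRUPT in sqlite3.h */
#define EXAMPLE_MAX_FRAMES 64

/* Hypothetical: number of frames captured by the last dump. */
int g_lastFrameCount = 0;

/* Has the signature expected for an SQLITE_CONFIG_LOG callback.
** It would be registered before any other SQLite call with:
**   sqlite3_config(SQLITE_CONFIG_LOG, exampleLogCallback, NULL);
*/
void exampleLogCallback(void *pArg, int iErrCode, const char *zMsg)
{
  void *frames[EXAMPLE_MAX_FRAMES];
  (void)pArg;
  fprintf(stderr, "sqlite log (%d): %s\n", iErrCode, zMsg ? zMsg : "");
  /* The low 8 bits hold the primary result code. */
  if ((iErrCode & 0xff) != EXAMPLE_SQLITE_CORRUPT) return;
  /* Capture and print the stack of the current thread only. */
  g_lastFrameCount = backtrace(frames, EXAMPLE_MAX_FRAMES);
  backtrace_symbols_fd(frames, g_lastFrameCount, STDERR_FILENO);
}
```

Because the callback runs on the thread that called sqlite3_log, the first frames it prints are the logging path itself, as in the stack above.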

Here are some of my conclusions:

  1. The zero-valued-bytes check also covers writes to the WAL file, but it never reported zero-valued bytes being written to the WAL.
  2. Some rogue file descriptor may have written the zero-valued bytes into the WAL file. But I have several databases with the same problem, and it seems unlikely that a rogue writer would write zero-valued bytes only into WAL files and never into the other database files or ordinary files.
  3. I guess it could be a problem in the operating system. I work on iOS, but I have no further ideas.
  4. It can happen under normal conditions, but it happens much more easily when free disk space is low. I have no further ideas about this either.

So, here are my questions:

  1. Does anyone have any idea about this?
  2. What can I do to resolve this type of corruption?

Note that if a page of sqlite_master has been overwritten with zero-valued bytes, the [.dump] shell command cannot be used to repair the database.
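
To see which pages of a damaged file were zeroed, including page 1, which holds the database header and the root of sqlite_master, an offline scan along these lines can help. This is only a sketch: countZeroPages is a hypothetical helper, not part of SQLite, and the page size has to be supplied by the caller because the header field that stores it is lost when page 1 itself is zeroed.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical offline helper, not part of SQLite.
** Reads `path` in chunks of `pageSize` bytes and prints the 1-based
** number of every page that is entirely zero. A trailing partial page
** is ignored. Returns the number of all-zero pages, or -1 on error. */
long countZeroPages(const char *path, size_t pageSize)
{
  FILE *f;
  unsigned char *page, *zeros;
  long pgno = 0, nZero = 0;

  if (pageSize == 0) return -1;
  f = fopen(path, "rb");
  if (f == NULL) return -1;
  page = (unsigned char *)malloc(pageSize);
  zeros = (unsigned char *)calloc(1, pageSize);  /* all-zero reference page */
  if (page == NULL || zeros == NULL) {
    free(page); free(zeros); fclose(f);
    return -1;
  }
  while (fread(page, 1, pageSize, f) == pageSize) {
    pgno++;
    if (memcmp(page, zeros, pageSize) == 0) {
      printf("page %ld is all zero%s\n", pgno,
             pgno == 1 ? " (database header and sqlite_master root)" : "");
      nZero++;
    }
  }
  free(page); free(zeros); fclose(f);
  return nZero;
}
```

For example, countZeroPages("corrupt.db", 4096) would scan a hypothetical file corrupt.db, assuming its page size was 4096 bytes.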

