Last modified 23 months ago Last modified on 16/03/10 16:25:13

PostgreSQL (aka Postgres)

From the  PostgreSQL website:

"PostgreSQL is a powerful, open source relational database system. It has more than 15 years of active development and a proven architecture that has earned it a strong reputation for reliability, data integrity, and correctness. It runs on all major operating systems, including Linux, UNIX (AIX, BSD, HP-UX, SGI IRIX, Mac OS X, Solaris, Tru64), and Windows. It is fully ACID compliant, has full support for foreign keys, joins, views, triggers, and stored procedures (in multiple languages)."

"An enterprise class database, PostgreSQL boasts sophisticated features such as Multi-Version Concurrency Control (MVCC), point in time recovery, tablespaces, asynchronous replication, nested transactions (savepoints), online/hot backups, a sophisticated query planner/optimizer, and write ahead logging for fault tolerance."

Some useful postgres commands

Once postgres is installed there are a number of commands you will find useful:

In all of the following the default server is localhost use '-h servername' to specify a different server.

Creating/Dropping? databases

  • createdb to create a new database with a particular name. When you create the database you can use -E to specify the character encoding to use if different to the default. For better performance with InterMine you should use SQL_ASCII.
% createdb -E SQL_ASCII test

 Postgres documentation

  • dropdb to drop a database completely and remove the contents, this will fail if any connections are open to the database.
% dropdb test

 Postgres documentation

Running SQL commands

To run SQL commands from a shell use the psql program and type SQL followed by a ';'.

% psql test

In psql you can use:

  • \d - list tables and sequences
  • \dt - list tables only
  • \ds - list sequences only
  • \di - list indexes
  • \dv - list views
  • \l to list all databases
  • \q or control-d to exit

From the command line you can lists available databases with the '-l' flag.

% psql -l

        List of databases
   Name    |  Owner   | Encoding  
-----------+----------+-----------
 postgres  | postgres | SQL_ASCII
 template0 | postgres | SQL_ASCII
 template1 | postgres | SQL_ASCII
 test      | username | SQL_ASCII
(4 rows)

Character Set Encoding

We recommend using either SQL_ASCII or UTF-8. Theoretically, we should be using UTF-8, which is more correct, however its performance is rather poor, so we use SQL_ASCII.

The InterMine system stores all text in the database in UTF-8 format. If you set Postgres to LATIN-9, then Postgres will perform some incorrect conversions, and may even give an error. Setting the format to UTF-8 results in Postgres treating the text completely correctly, which is quite a complicated and slow operation in UTF-8.

If you set Postgres to SQL_ASCII, then that is a special character set in Postgres, which basically means "do no conversions". This is sufficient for almost all operations. All comparisons and index lookups will be done on a byte-by-byte basis, which is much faster than having to deal with Unicode's complications.

Please try to treat InterMine as a black box. The fact that it uses Postgres to store its data should be a detail that should be hidden as much as possible. The InterMine system is written in Java, and therefore handles all text in Unicode.


See:  Postgres documentation, Oracle