PostgreSQL (aka Postgres)
From the PostgreSQL website:
"PostgreSQL is a powerful, open source relational database system. It has more than 15 years of active development and a proven architecture that has earned it a strong reputation for reliability, data integrity, and correctness. It runs on all major operating systems, including Linux, UNIX (AIX, BSD, HP-UX, SGI IRIX, Mac OS X, Solaris, Tru64), and Windows. It is fully ACID compliant, has full support for foreign keys, joins, views, triggers, and stored procedures (in multiple languages)."
"An enterprise class database, PostgreSQL boasts sophisticated features such as Multi-Version Concurrency Control (MVCC), point in time recovery, tablespaces, asynchronous replication, nested transactions (savepoints), online/hot backups, a sophisticated query planner/optimizer, and write ahead logging for fault tolerance."
Some useful postgres commands
Once postgres is installed there are a number of commands you will find useful:
In all of the following the default server is localhost use '-h servername' to specify a different server.
Creating/Dropping? databases
- createdb to create a new database with a particular name. When you create the database you can use -E to specify the character encoding to use if different to the default. For better performance with InterMine you should use SQL_ASCII.
% createdb -E SQL_ASCII test
- dropdb to drop a database completely and remove the contents, this will fail if any connections are open to the database.
% dropdb test
Running SQL commands
To run SQL commands from a shell use the psql program and type SQL followed by a ';'.
% psql test
In psql you can use:
- \d - list tables and sequences
- \dt - list tables only
- \ds - list sequences only
- \di - list indexes
- \dv - list views
- \l to list all databases
- \q or control-d to exit
From the command line you can lists available databases with the '-l' flag.
% psql -l
List of databases
Name | Owner | Encoding
-----------+----------+-----------
postgres | postgres | SQL_ASCII
template0 | postgres | SQL_ASCII
template1 | postgres | SQL_ASCII
test | username | SQL_ASCII
(4 rows)
Character Set Encoding
We recommend using either SQL_ASCII or UTF-8. Theoretically, we should be using UTF-8, which is more correct, however its performance is rather poor, so we use SQL_ASCII.
The InterMine system stores all text in the database in UTF-8 format. If you set Postgres to LATIN-9, then Postgres will perform some incorrect conversions, and may even give an error. Setting the format to UTF-8 results in Postgres treating the text completely correctly, which is quite a complicated and slow operation in UTF-8.
If you set Postgres to SQL_ASCII, then that is a special character set in Postgres, which basically means "do no conversions". This is sufficient for almost all operations. All comparisons and index lookups will be done on a byte-by-byte basis, which is much faster than having to deal with Unicode's complications.
Please try to treat InterMine as a black box. The fact that it uses Postgres to store its data should be a detail that should be hidden as much as possible. The InterMine system is written in Java, and therefore handles all text in Unicode.
See: Postgres documentation, Oracle
