PostgreSQL table partitioning and flow-export

To export data aggregated by flow-capture to Postgresql, similar table structure must be created:

After this, flow-capture must be launched as following:

#file: /etc/flow-tools/flow-capture.conf
-w /var/flowdata -n 95  -N 3 -R /usr/local/bin/flowdata-export.sh 0/0/555

This means that it will run /usr/local/bin/flowdata-export.sh script every time it rotates its log file. With the first argument as the flow file name after rotating. Here's is the simple script:

#!/bin/sh

LOGS=/var/flowdata
DB=dbuser:dbpass:dbhost:dbport:dbname:dbtable

if [ $# -lt 1 ]; then
    echo "USE: $0 <filename>"
    exit
fi

flow-export -f 5 -m UNIX_SECS,DOCTETS,SRCADDR,DSTADDR,SRCPORT,DSTPORT,PROT -u $DB < $LOGS/$1

There can be a lot of data in db, and it could be wise to split it into smaller chunks. This can be done using postgresql table partitioning. Here's how we can improve our database/table setup:

After this, there will be a dedicated table for every month. Data can be insterted into flowdata table as before (as well as selected), though in fact it will reside in its corresponding table. Such structure is good for keeping logs for the last year. All we need to do is to flush current month table every first day of the month. Thus we'll always have data for the last 12 monthes.

PowerDNS, gmysql backend and Chicken and Egg Problem

Yesterday I had a hard time making PowerDNS (pdns) start working using gmysql backend. Like I figured out later, this was a «Chicken and Egg problem». To start working, pdns must connect to mysql, to send its queries. But mysql needs DNS to resolve pdns' ip in order to authenticate it.

To avoid such situations in future, i added skip-name-resolve into /etc/my.cnf and substituted all hostnames with corresponding ip's in mysq.user table and some config files.

Setup of libpam-mysql, libnss-mysql-bg with Django admin application

When you have many servers to manage, it can be usefull to have some centrelized server for user authentication and accounting. Libpam-mysql and libnss-mysql-bg permit keep your data in mysql db and make it possible to use it by pam-aware services on your system.

From the other hand, it is goot to have some web interface to manage all these records. I decided to use excellent Admin application from Django framework for this purpose

Here are the simple steps:

$ django-admin startproject webadmin
$ cd webadmin
$ ./manage.py startapp pam

After this, costumize your settings.py and add similar models.py to the pam/ folder.

To prevent system and mysql uids/gids overlapping, it's probably a good idea to set Mysql sequence numbers to some reasonable value, like 2000 or similar.

# apt-get install libnss-mysql-bg libpap-mysql After this, make you config files look similar to this:

# /etc/libnss-mysql.cfg
getpwnam    SELECT username,'x',uid,gid,gecos,homedir,shell \
            FROM pam_user \
            WHERE username='%1$s' AND is_active=1 \
            LIMIT 1
getpwuid    SELECT username,'x',uid,gid,gecos,homedir,shell \
            FROM pam_user \
            WHERE uid='%1$u' \
            LIMIT 1
getspnam    SELECT username,password,lstchg,min,max,warn,inact,expire,flag \
            FROM pam_user \
            WHERE username='%1$s' \
            LIMIT 1
getpwent    SELECT username,'x',uid,gid,gecos,homedir,shell \
            FROM pam_user
getspent    SELECT username,password,lstchg,min,max,warn,inact,expire,flag \
            FROM pam_user
getgrnam    SELECT name,password,gid \
            FROM pam_group \
            WHERE name='%1$s' \
            LIMIT 1
getgrgid    SELECT name,password,gid \
            FROM pam_group \
            WHERE gid='%1$u' \
            LIMIT 1
getgrent    SELECT name,password,gid \
            FROM pam_group
memsbygid   SELECT pam_user.username \
            FROM pam_user,pam_user_groups \
            WHERE pam_user_groups.group_id='%1$u' AND pam_user.uid=pam_user_groups.user_id
gidsbymem   SELECT pam_user_groups.group_id \
            FROM pam_user,pam_user_groups \
            WHERE pam_user.username='%1$s' AND pam_user.uid=pam_user_groups.user_id

host        192.168.1.1
database    webadmin
username   webadmin
password    secret
timeout      3
compress    0
# /etc/libnss-mysql-root.cfg
username    nss-root
password    s3cr3t

After this, you need to add some data into db using Django admin interface, and tweak pam configuration. In Debian/Ubuntu this can be done like this:

# /etc/nsswitch.conf
passwd:         compat mysql
group:          compat mysql
shadow:         compat mysql

[...]
# /etc/pam.d/common-auth

auth    [success=1 default=ignore]      pam_unix.so
auth    required                        pam_mysql.so user=webadmin passwd=secret host=192.168.1.1 db=webadmin table=pam_user usercolumn=pam_user.username passwdcolumn=pam_user.password crypt=1 [where=pam_user.is_active="1"]
auth    required                        pam_permit.so
# /etc/pam.d/common-account

account [success=1 default=ignore]      pam_unix.so
account required                        pam_mysql.so user=webadmin passwd=secret host=192.168.1.1 db=webadmin table=pam_user usercolumn=pam_user.username passwdcolumn=pam_user.password crypt=1 [where=pam_user.is_active="1"]
account required                        pam_permit.so
# /etc/pam.d/common-password

password       sufficient      pam_mysql.so user=webadmin passwd=secret host=192.168.1.1 db=webadmin table=pam_user usercolumn=pam_user.username passwdcolumn=pam_user.password crypt=1 [where=pam_user.is_active="1"]
password       required        pam_unix.so nullok obscure min=4 max=8 md5 try_first_pass
# /etc/pam.d/common-session

session required    pam_mkhomedir.so skel=/etc/skel/ umask=0022
session required        pam_unix.so

That's all! Now things like id <some_mysql_user> should work.

P.S.
I had to do /etc/init.d/sshd reload to make it work for new mysql users.

Pgcluster: error on balancer

I'm currently playing with Pgcluster 1.5.0rc9, which is a multi-master and synchronous replication system that supports load balancing for PostgreSQL. This version is based on PostgreSQL 8.1.4. There's no much documentation i could found, but everything was going smoothly untill i tried to lunch pglb. This daemon must be run on Pgcluster load balancer. I took default config pglb.conf and substituted hostnames with the right ones:

#============================================================
#          Load Balance Server configuration file
#-------------------------------------------------------------
# file: pglb.conf
#-------------------------------------------------------------
# This file controls:
#       o which hosts are db cluster server
#       o which port  use connect to db cluster server
#       o how many connections are allowed on each DB server
#============================================================
#-------------------------------------------------------------
# set cluster DB server information
#               o Host_Name : hostname of Cluster DB
#                             -- please write a host name by FQDN or IP address.
#               o Port : connection for postmaster
#               o Max_Connection : maximun number of connection to postmaster
#-------------------------------------------------------------
<Cluster_Server_Info>
    <Host_Name>     n1.example.com  </Host_Name>
    <Port>                 5432                   </Port>
    <Max_Connect>  0                          </Max_Connect>
</Cluster_Server_Info>
<Cluster_Server_Info>
    <Host_Name>     n2.example.com  </Host_Name>
    <Port>                 5432                   </Port>
    <Max_Connect>  0                         </Max_Connect>
</Cluster_Server_Info>
#-------------------------------------------------------------
# set Load Balance server information
#               o Host_Name : The host name of this load balance server.
#                             -- please write a host name by FQDN or IP address.
#               o Backend_Socket_Dir : Unix domain socket path for the backend
#               o Receive_Port : connection from client
#               o Recovery_Port : connection for recovery process
#               o Max_Cluster_Num : maximun number of cluster DB servers
#               o Use_Connection_Pooling : use connection pool [yes/no] 
#-------------------------------------------------------------
<Host_Name> b.example.com </Host_Name>
<Backend_Socket_Dir>    /tmp     </Backend_Socket_Dir>
<Receive_Port>               5432     </Receive_Port>
<Recovery_Port>             6001     </Recovery_Port>
<Max_Cluster_Num>        128      </Max_Cluster_Num>
<Use_Connection_Pooling>  yes     </Use_Connection_Pooling>

Configuration looked ok, but pglb failed to run:

2006-10-25 12:40:47 [5272] DEBUG:PGRset_status_on_cluster_tbl():host:n1.example.com port:5432 max:0 use:0 status2
2006-10-25 12:40:47 [5272] DEBUG:PGRset_status_on_cluster_tbl():host:n2.example.com port:5432 max:0 use:0 status2
2006-10-25 12:40:47 [5272] DEBUG:init_pglb():Child_Tbl size is[0]
2006-10-25 12:40:47 [5272] ERROR:init_pglb():ChildShm shmget() failed. (Invalid argument)

The reason of this was Max_Connect directive. In the config sample it was set to 0. And i thought that 0 in this context might mean unlimited number of connections to this node. But now it seems it meant something different. After i changed 0's to some other positive values, it began to work.

multiple definition of `__i686.get_pc_thunk.bx'

I needed to compile postgresql from source on debian box. I obtained source/dsc/diff files and tried to build debian package:

$ apt-get source postgresql

$ cd postgresql-7.4.7

$ dpkg-buildpackage -rfakeroot -uc -us
And during compilation i got following error:
/usr/lib/libc_nonshared.a(elf-init.oS)(.gnu.linkonce.t.__i686.get_pc_thunk.bx+0x0): In function `__i686.get_pc_thunk.bx':
: multiple definition of `__i686.get_pc_thunk.bx'
../ecpglib/typename.o(.gnu.linkonce.t.__i686.get_pc_thunk.bx+0x0):
/tmp/postgres/postgresql-7.4.7/build-tree/postgresql-7.4.7/src/interfaces/ecpg/ecpglib/typename.c:18: first defined here
collect2: ld returned 1 exit status

In google i found a few links about such errors. It seemed like it was gcc related issue. From some maillist archive/forum:

« After a little research I tried upgrading from gcc-3.4 to gcc-3.5, which is the latest version currently available in debian stable. The error persisted. After that I tried upgrading to gcc-4.0 out of debian testing, which solved the issue. »

I was running gcc 3.3.5 on that box, so i decided to try with gcc 3.4.4, which is also available in Debian Sarge. The outcome was the same. I changed symlink /usr/bin/gcc from /usr/bin/gcc-3.3 to /usr/bin/gcc-3.4 - stil no luck. And after a dozen of failed compilations with different paramenters set/unset i noticed that it uses /usr/bin/i386-linux-gcc as a compiler. Which was a symlink to /usr/bin/i386-linux-gcc-3.3 (which was a symlink to /usr/bin/gcc-3.3)

I changed it to /usr/bin/i386-linux-gcc-3.4 and successfully compiled postgresql :)