Vous n'êtes pas identifié(e).

#1 Re : Général » Démarrage incomplet » 18/02/2011 17:59:03

Non, aucune sauvegardes n'étaient en cours. Le serveur frontal tomcat a planté, et plein de requêtes étaient en cours sur la bd (select/update).
Impossible de nettoyer les requêtes, postgres ne répondait plus. On a alors lancer l'arrêt du service, mais cela ne répondait pas, ensuite, après 15min, on a lancer un kill -15 sur les processus et 15 min ensuite kill -9, car impossible de faire autrement.

#2 Re : Général » Démarrage incomplet » 18/02/2011 16:10:24

Aprés un reindex , le vacuum full analyze se termine proprement.

#3 Re : Général » Démarrage incomplet » 18/02/2011 16:07:19

j'ai récupéré des logs :

Feb 18 07:51:05 icD postgres[26471]: [1-3] 2011-02-18 07:51:05.061 CET;LOCATION:  CreateLockFile, miscinit.c:866
Feb 18 07:51:51 icD postgres[26477]: [1-1] 2011-02-18 07:51:51.376 CET;LOG:  00000: database system was interrupted while in
recovery at 2011-02-17 23:08:37 CET
Feb 18 07:51:51 icD postgres[26477]: [1-2] 2011-02-18 07:51:51.376 CET;HINT:  This probably means that some data is corrupted
and you will have to use the last backup for
Feb 18 07:51:51 icD postgres[26477]: [1-3]  recovery.
Feb 18 07:51:51 icD postgres[26477]: [1-4] 2011-02-18 07:51:51.376 CET;LOCATION:  StartupXLOG, xlog.c:4729
Feb 18 07:51:51 icD postgres[26477]: [2-1] 2011-02-18 07:51:51.376 CET;LOG:  00000: checkpoint record is at 6/4E4B8040
Feb 18 07:51:51 icD postgres[26477]: [2-2] 2011-02-18 07:51:51.376 CET;LOCATION:  StartupXLOG, xlog.c:4826
Feb 18 07:51:51 icD postgres[26477]: [3-1] 2011-02-18 07:51:51.376 CET;LOG:  00000: redo record is at 6/4E4B8040; undo record
is at 0/0; shutdown FALSE
Feb 18 07:51:51 icD postgres[26477]: [3-2] 2011-02-18 07:51:51.376 CET;LOCATION:  StartupXLOG, xlog.c:4853
Feb 18 07:51:51 icD postgres[26477]: [4-1] 2011-02-18 07:51:51.376 CET;LOG:  00000: next transaction ID: 0/10530268; next OID
: 94537
Feb 18 07:51:51 icD postgres[26477]: [4-2] 2011-02-18 07:51:51.377 CET;LOCATION:  StartupXLOG, xlog.c:4857
Feb 18 07:51:51 icD postgres[26477]: [5-1] 2011-02-18 07:51:51.377 CET;LOG:  00000: next MultiXactId: 45; next MultiXactOffse
t: 89
Feb 18 07:51:51 icD postgres[26477]: [5-2] 2011-02-18 07:51:51.377 CET;LOCATION:  StartupXLOG, xlog.c:4860
Feb 18 07:51:51 icD postgres[26477]: [6-1] 2011-02-18 07:51:51.377 CET;LOG:  00000: database system was not properly shut dow
n; automatic recovery in progress
Feb 18 07:51:51 icD postgres[26477]: [6-2] 2011-02-18 07:51:51.377 CET;LOCATION:  StartupXLOG, xlog.c:4927
Feb 18 07:51:51 icD postgres[26477]: [7-1] 2011-02-18 07:51:51.381 CET;LOG:  00000: redo starts at 6/4E4B8090
Feb 18 07:51:51 icD postgres[26477]: [7-2] 2011-02-18 07:51:51.381 CET;LOCATION:  StartupXLOG, xlog.c:4989
Feb 18 07:51:51 icD postgres[26477]: [8-1] 2011-02-18 07:51:51.511 CET;LOG:  00000: record with zero length at 6/4F5B58F0
Feb 18 07:51:51 icD postgres[26477]: [8-2] 2011-02-18 07:51:51.511 CET;LOCATION:  ReadRecord, xlog.c:3069
Feb 18 07:51:51 icD postgres[26477]: [9-1] 2011-02-18 07:51:51.512 CET;LOG:  00000: redo done at 6/4F5B58C0
Feb 18 07:51:51 icD postgres[26477]: [9-2] 2011-02-18 07:51:51.512 CET;LOCATION:  StartupXLOG, xlog.c:5059
Feb 18 08:00:06 icD postgres[26613]: [1-1] 2011-02-18 08:00:06.402 CET;akioFATAL:  57P03: the database system is starting up
Feb 18 08:00:06 icD postgres[26613]: [1-2] 2011-02-18 08:00:06.402 CET;akioLOCATION:  ProcessStartupPacket, postmaster.c:1583
Feb 18 08:00:06 icD postgres[26614]: [1-1] 2011-02-18 08:00:06.529 CET;akioFATAL:  57P03: the database system is starting up
Feb 18 08:00:06 icD postgres[26614]: [1-2] 2011-02-18 08:00:06.529 CET;akioLOCATION:  ProcessStartupPacket, postmaster.c:1583
Feb 18 08:00:06 icD postgres[26617]: [1-1] 2011-02-18 08:00:06.841 CET;akioFATAL:  57P03: the database system is starting up
Feb 18 08:00:06 icD postgres[26617]: [1-2] 2011-02-18 08:00:06.841 CET;akioLOCATION:  ProcessStartupPacket, postmaster.c:1583
Feb 18 08:00:07 icD postgres[26618]: [1-1] 2011-02-18 08:00:07.359 CET;akioFATAL:  57P03: the database system is starting up
Feb 18 08:00:07 icD postgres[26618]: [1-2] 2011-02-18 08:00:07.359 CET;akioLOCATION:  ProcessStartupPacket, postmaster.c:1583
Feb 18 08:00:07 icD postgres[26619]: [1-1] 2011-02-18 08:00:07.469 CET;akioFATAL:  57P03: the database system is starting up
Feb 18 08:00:07 icD postgres[26619]: [1-2] 2011-02-18 08:00:07.469 CET;akioLOCATION:  ProcessStartupPacket, postmaster.c:1583
Feb 18 08:00:07 icD postgres[26620]: [1-1] 2011-02-18 08:00:07.629 CET;akioFATAL:  57P03: the database system is starting up
Feb 18 08:00:07 icD postgres[26620]: [1-2] 2011-02-18 08:00:07.629 CET;akioLOCATION:  ProcessStartupPacket, postmaster.c:1583
Feb 18 08:00:07 icD postgres[26621]: [1-1] 2011-02-18 08:00:07.755 CET;akioFATAL:  57P03: the database system is starting up
Feb 18 08:00:07 icD postgres[26621]: [1-2] 2011-02-18 08:00:07.755 CET;akioLOCATION:  ProcessStartupPacket, postmaster.c:1583
Feb 18 08:00:07 icD postgres[26622]: [1-1] 2011-02-18 08:00:07.777 CET;akioFATAL:  57P03: the database system is starting up
Feb 18 08:00:07 icD postgres[26622]: [1-2] 2011-02-18 08:00:07.777 CET;akioLOCATION:  ProcessStartupPacket, postmaster.c:1583
Feb 18 08:00:08 icD postgres[26623]: [1-1] 2011-02-18 08:00:08.066 CET;akioFATAL:  57P03: the database system is starting up
Feb 18 08:00:08 icD postgres[26623]: [1-2] 2011-02-18 08:00:08.066 CET;akioLOCATION:  ProcessStartupPacket, postmaster.c:1583
Feb 18 08:01:01 icD auditd[2204]: Audit daemon rotating log files
Feb 18 08:01:07 icD postgres[26655]: [1-1] 2011-02-18 08:01:07.476 CET;akioFATAL:  57P03: the database system is starting up
Feb 18 08:01:07 icD postgres[26655]: [1-2] 2011-02-18 08:01:07.476 CET;akioLOCATION:  ProcessStartupPacket, postmaster.c:1583
Feb 18 08:01:07 icD postgres[26656]: [1-1] 2011-02-18 08:01:07.602 CET;akioFATAL:  57P03: the database system is starting up
Feb 18 08:01:07 icD postgres[26656]: [1-2] 2011-02-18 08:01:07.602 CET;akioLOCATION:  ProcessStartupPacket, postmaster.c:1583
Feb 18 08:01:07 icD postgres[26657]: [1-1] 2011-02-18 08:01:07.916 CET;akioFATAL:  57P03: the database system is starting up
Feb 18 08:01:07 icD postgres[26657]: [1-2] 2011-02-18 08:01:07.916 CET;akioLOCATION:  ProcessStartupPacket, postmaster.c:1583
Feb 18 08:01:08 icD postgres[26658]: [1-1] 2011-02-18 08:01:08.427 CET;akioFATAL:  57P03: the database system is starting up
Feb 18 08:01:08 icD postgres[26658]: [1-2] 2011-02-18 08:01:08.427 CET;akioLOCATION:  ProcessStartupPacket, postmaster.c:1583
Feb 18 08:01:08 icD postgres[26659]: [1-1] 2011-02-18 08:01:08.536 CET;akioFATAL:  57P03: the database system is starting up
Feb 18 08:01:08 icD postgres[26659]: [1-2] 2011-02-18 08:01:08.536 CET;akioLOCATION:  ProcessStartupPacket, postmaster.c:1583
Feb 18 08:01:08 icD postgres[26660]: [1-1] 2011-02-18 08:01:08.846 CET;akioFATAL:  57P03: the database system is starting up
Feb 18 08:01:08 icD postgres[26660]: [1-2] 2011-02-18 08:01:08.846 CET;akioLOCATION:  ProcessStartupPacket, postmaster.c:1583
Feb 18 08:15:05 icD postgres[26752]: [1-1] 2011-02-18 08:15:05.854 CET;akioFATAL:  57P03: the database system is starting up
Feb 18 08:15:05 icD postgres[26752]: [1-2] 2011-02-18 08:15:05.854 CET;akioLOCATION:  ProcessStartupPacket, postmaster.c:1583
Feb 18 08:15:06 icD postgres[26753]: [1-1] 2011-02-18 08:15:06.248 CET;akioFATAL:  57P03: the database system is starting up
Feb 18 08:15:06 icD postgres[26753]: [1-2] 2011-02-18 08:15:06.248 CET;akioLOCATION:  ProcessStartupPacket, postmaster.c:1583
Feb 18 08:15:06 icD postgres[26754]: [1-1] 2011-02-18 08:15:06.557 CET;akioFATAL:  57P03: the database system is starting up
Feb 18 08:15:06 icD postgres[26754]: [1-2] 2011-02-18 08:15:06.558 CET;akioLOCATION:  ProcessStartupPacket, postmaster.c:1583
Feb 18 08:15:07 icD postgres[26757]: [1-1] 2011-02-18 08:15:07.075 CET;akioFATAL:  57P03: the database system is starting up
Feb 18 08:15:07 icD postgres[26757]: [1-2] 2011-02-18 08:15:07.075 CET;akioLOCATION:  ProcessStartupPacket, postmaster.c:1583
Feb 18 08:15:07 icD postgres[26758]: [1-1] 2011-02-18 08:15:07.184 CET;akioFATAL:  57P03: the database system is starting up
Feb 18 08:15:07 icD postgres[26758]: [1-2] 2011-02-18 08:15:07.184 CET;akioLOCATION:  ProcessStartupPacket, postmaster.c:1583
Feb 18 08:15:07 icD postgres[26759]: [1-1] 2011-02-18 08:15:07.492 CET;akioFATAL:  57P03: the database system is starting up
Feb 18 08:15:07 icD postgres[26759]: [1-2] 2011-02-18 08:15:07.492 CET;akioLOCATION:  ProcessStartupPacket, postmaster.c:1583
Feb 18 08:15:07 icD postgres[26760]: [1-1] 2011-02-18 08:15:07.683 CET;akioFATAL:  57P03: the database system is starting up
Feb 18 08:15:07 icD postgres[26760]: [1-2] 2011-02-18 08:15:07.683 CET;akioLOCATION:  ProcessStartupPacket, postmaster.c:1583
Feb 18 08:15:07 icD postgres[26761]: [1-1] 2011-02-18 08:15:07.815 CET;akioFATAL:  57P03: the database system is starting up
Feb 18 08:15:07 icD postgres[26761]: [1-2] 2011-02-18 08:15:07.815 CET;akioLOCATION:  ProcessStartupPacket, postmaster.c:1583
Feb 18 08:15:08 icD postgres[26762]: [1-1] 2011-02-18 08:15:08.124 CET;akioFATAL:  57P03: the database system is starting up
Feb 18 08:15:08 icD postgres[26762]: [1-2] 2011-02-18 08:15:08.124 CET;akioLOCATION:  ProcessStartupPacket, postmaster.c:1583
Feb 18 08:20:05 icD postgres[26809]: [1-1] 2011-02-18 08:20:05.052 CET;akioFATAL:  57P03: the database system is starting up
...

#4 Re : Général » Démarrage incomplet » 18/02/2011 16:03:59

Voici la liste :
[root@icD DB]# ll /opt/akio/db/aic-bnp/pg_xlog/
total 114836
-rw------- 1 postgres postgres      252 fév 11 18:39 00000001000000040000007B.00000020.backup
-rw------- 1 postgres postgres 16777216 fév 18 12:38 000000010000000600000051
-rw------- 1 postgres postgres 16777216 fév 18 12:53 000000010000000600000052
-rw------- 1 postgres postgres 16777216 fév 18 13:08 000000010000000600000053
-rw------- 1 postgres postgres 16777216 fév 18 13:23 000000010000000600000054
-rw------- 1 postgres postgres 16777216 fév 18 13:38 000000010000000600000055
-rw------- 1 postgres postgres 16777216 fév 18 13:53 000000010000000600000056
-rw------- 1 postgres postgres 16777216 fév 18 14:08 000000010000000600000057
drwx------ 2 postgres postgres     4096 fév 18 14:08 archive_status
[root@icD DB]#

Oui le dump se fait sans erreur.
je lance un vaccum full analyze et vous dit le résultat...

#5 Re : Général » Démarrage incomplet » 18/02/2011 13:25:50

j'ai réinitialisé les journaux de transactions. la base répond.
Comment trouver la cause de ce problème ?

(j'ai une sauvegarde du fs avant reinit des journaux de transactions).

#6 Re : Général » Démarrage incomplet » 18/02/2011 13:19:17

Je n'ai pas de trace dans les logs ; j'ai monté les niveaux de trace mais sans succès ;
log_destination = 'syslog'              # Valid values are combinations of
log_directory = '/var/akio/log/DB/'             # Directory where log files are written
log_filename = 'postgresql.log' # Log file name pattern.
log_rotation_age = 1d                   # Automatic rotation of logfiles will
log_rotation_size = 1000000                     # Automatic rotation of logfiles will
log_min_messages = debug1               # Values, in order of decreasing detail:
log_error_verbosity = verbose           # terse, default, or verbose messages
log_min_error_statement = debug1        # Values in order of increasing severity:
log_connections = on
log_line_prefix = '%m;%d'
client_min_messages = debug1   

Comment obtenir les traces ?

#7 Général » Démarrage incomplet » 17/02/2011 19:59:29

Akio
Réponses : 13

Bonjour,

Suite à un problème du serveur postgres 8.2.17 sous une RedHat ES 5.5 x64 nous tentons de redémarrer le service sans succès.
les logs indiquent la récupération du journal de transaction est indiqué faites, mais il ne rend pas le service à la suite ;
une connexion en psql indique que le base de données est en cours de démarrage.
L'arrêt est impossible sans killer les processus.

En redémarrant avec la commande strace, on tombe sur un timeout qui boucle :
munmap(0x2aaf209c2000, 4096)            = 0
open("/etc/hosts", O_RDONLY)            = 7
fcntl(7, F_GETFD)                       = 0
fcntl(7, F_SETFD, FD_CLOEXEC)           = 0
fstat(7, {st_mode=S_IFREG|0644, st_size=278, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aaf209c2000
read(7, "# Do not remove the following li"..., 4096) = 278
close(7)                                = 0
munmap(0x2aaf209c2000, 4096)            = 0
socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 7
bind(7, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
getsockname(7, {sa_family=AF_INET, sin_port=htons(55122), sin_addr=inet_addr("127.0.0.1")}, [10089892752658530320]) = 0
connect(7, {sa_family=AF_INET, sin_port=htons(55122), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
sendto(7, "\307", 1, 0, NULL, 0)        = 1
select(8, [7], NULL, NULL, {0, 500000}) = 1 (in [7], left {0, 500000})
recvfrom(7, "\307", 1, 0, NULL, NULL)   = 1
fcntl(7, F_SETFL, O_RDONLY|O_NONBLOCK)  = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x2aaee99c36c0) = 1913
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
select(6, [3 4 5], NULL, NULL, {60, 0}) = 0 (Timeout)
rt_sigprocmask(SIG_SETMASK, ~[ILL TRAP ABRT BUS FPE SEGV CONT SYS RTMIN RT_1], NULL, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
select(6, [3 4 5], NULL, NULL, {60, 0}) = 0 (Timeout)
rt_sigprocmask(SIG_SETMASK, ~[ILL TRAP ABRT BUS FPE SEGV CONT SYS RTMIN RT_1], NULL, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
select(6, [3 4 5], NULL, NULL, {60, 0}) = 0 (Timeout)
rt_sigprocmask(SIG_SETMASK, ~[ILL TRAP ABRT BUS FPE SEGV CONT SYS RTMIN RT_1], NULL, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
select(6, [3 4 5], NULL, NULL, {60, 0}

Quelle est la raison de ce timeout ?
Comment résoudre la mise en ligne de la base de données ?

Pied de page des forums

Propulsé par FluxBB