Saturday, March 23, 2013

LiveCycle GDS Demystified

* This post discusses modifying records in the LiveCycle database.  Please use these instructions at your own risk, and involve Adobe Support before doing so if at all possible.

I was recently involved in a situation where the LiveCycle database (specifically the ADEP version) was growing out of control, to the point that the client was running out of room for backups.  They were using a database-based GDS, so we knew where to start looking.

We started by running the following script:

EXEC sp_spaceused 'adep_user.tb_dm_chunk'

This showed us that the GDS table was taking up just over 160 gigs of space.  We opened the Health Monitor in the adminui to do a one-time job purge, then configured that purge to run on a nightly basis.  This immediately shrank the database down to 137 gigs.
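
sp_spaceused also works at the database level, which makes it easy to track the overall size while these purges run; called with no arguments, it reports the size of the current database:

EXEC sp_spaceused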

The next step was to look at processes, so I ran the following query; the low count confirmed that all of their current processes were short-lived:

select COUNT(1) from adep_user.tb_process_instance

The following query breaks down the size in megs of the largest records in the chunk table.

select top 10000 convert(decimal(19,2), DATALENGTH(c.content) / 1024.0 / 1024.0) as megs, c.idx, c.update_time, c.create_time, s.*
from adep_user.tb_dm_chunk c join adep_user.tb_dm_session_reference s on s.documentid = c.documentid
order by megs desc

The above query showed me three things:
  1. The records with a sessionid starting with "audit" were all very small, so process recordings weren't a concern at this point.
  2. We had two massive JobManager sessions that were over a year old, one at 847 megs and the other at 421 megs, but again, not our immediate problem.
  3. 84% of our records looked to be orphaned, as they had a "-1" sessionid.
To validate that these "-1" sessions were in fact the culprit, I ran the following query and found that they totaled just over 135 gigs:

select convert(decimal(19,2), sum(convert(bigint, DATALENGTH(c.content))) / 1024.0 / 1024.0) as megs
from adep_user.tb_dm_chunk c join adep_user.tb_dm_session_reference s on s.documentid = c.documentid
where s.sessionid = '-1'

The document max inline size was still at its default of 65,536 bytes, so I figured these were all coming from short-lived processes.  To confirm this, I established a baseline for the next batch by finding the max id in the session reference table, which was 36038:

select MAX(id) from adep_user.tb_dm_session_reference

To double-check that more references didn't exist somewhere else in the database, I ran some of the queries found at http://sqlserver2000.databases.aspfaq.com/how-do-i-get-a-list-of-sql-server-tables-and-their-row-counts.html.  They showed that no other table had nearly enough records to hold references to these document instances.  I also ran some sys catalog queries to ensure that documentid wasn't used anywhere else.
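
For the sys check, a catalog query along these lines lists every table with a documentid column (a sketch; the exact query may differ):

select object_name(c.object_id) as table_name, c.name as column_name
from sys.columns c
where c.name = 'documentid'
order by table_name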

After a bit more digging, I found that these records exist in the tb_dm_deletion, tb_dm_chunk, and tb_dm_session_reference tables:

select d.*, s.*
from adep_user.tb_dm_deletion d
join adep_user.tb_dm_chunk c on c.documentid = d.documentid
join adep_user.tb_dm_session_reference s on s.documentid = c.documentid
where s.sessionid = '-1'

The above query showed me a JobManager type of session in the tb_dm_deletion table, so I knew these had to be coming from the watched folder that kicks off our process calls.  I monitored the session reference table while the next batch was running and noticed that more "-1" sessions were showing up alongside sessions that had "/liveTemporaries/" as the sessionid.  The "/liveTemporaries/" sessions slowly disappeared, meaning that they were being cleaned from the GDS properly.
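
A query along these lines, re-run while the batch executes, is one way to watch the newest session references as they show up (a sketch; ordering by the identity column puts the new arrivals on top):

select top 100 s.id, s.sessionid, s.documentid
from adep_user.tb_dm_session_reference s
order by s.id desc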

Our chunk table grew by the same number of records as files generated from the watched folder, and the sizes matched when I ran the following query:

select convert(decimal(19,2), DATALENGTH(content) / 1024.0 / 1024.0) as megs
from adep_user.tb_dm_chunk
where documentid in (select documentid from adep_user.tb_dm_session_reference where id > 36038)

This was our culprit: something about how we were exporting/saving/generating these output files was preventing them from being cleaned up.  After a bit more research, I found that a DSC was being used for compressing and uncompressing the files we were outputting (http://www.adobe.com/devnet/livecycle/articles/compressing-depressing-custom-dsc.html).

I took a very close look at this code and realized that the ZipDocuments method might be creating an extra artifact that wasn't getting cleaned up, because zippedDoc is never marked for disposal or bound to the transaction.  I'm hoping that adding zippedDoc.setTransactionBound(true); at the bottom of that function will ensure that no more documents are orphaned in the GDS, but that has yet to be confirmed.

We now had to clean up the orphaned records left in the GDS.  I highly suggest you proceed with caution and involve Adobe Support before running this script.

create table #badchunks ( documentid nvarchar(52), id numeric(19,0))

insert into #badchunks ( documentid, id )
select c.documentid, s.id
from adep_user.tb_dm_chunk c join adep_user.tb_dm_session_reference s on s.documentid = c.documentid
where sessionid = '-1'

delete c
from adep_user.tb_dm_chunk c join #badchunks bc on bc.documentid = c.documentid

delete s
from adep_user.tb_dm_session_reference s join #badchunks bc on bc.id = s.id

delete d
from adep_user.tb_dm_deletion d join #badchunks bc on bc.documentid = d.documentid

drop table #badchunks

I monitored the progress of this script by running the following query.  The with(readuncommitted) hint allows dirty reads, so the count isn't blocked by the in-flight deletes.

select COUNT(1) from adep_user.tb_dm_chunk with(readuncommitted)

Removing approximately 3,500 records with this script took about 25 minutes and brought the table down to just under 3 gigs.  The remaining entries are most likely old recordings that we could also look to purge if needed.
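
If we ever do go after those recordings, the earlier size query scoped to sessionids starting with "audit" would show what's reclaimable (a sketch, reusing the same join):

select convert(decimal(19,2), sum(convert(bigint, DATALENGTH(c.content))) / 1024.0 / 1024.0) as megs
from adep_user.tb_dm_chunk c join adep_user.tb_dm_session_reference s on s.documentid = c.documentid
where s.sessionid like 'audit%'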

SQL Server doesn't automatically relinquish space, so the final step is to manually shrink the database.  Using SQL Server Management Studio, I right-clicked the database and selected Tasks > Shrink > Database.  The dialog showed that the database was 95% free space, so we went ahead and shrank it.
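
The same shrink can also be scripted.  This is a minimal sketch, assuming the LiveCycle database is named adep (substitute your actual database name); the second argument leaves 10 percent free space after the shrink:

-- "adep" is a placeholder; use your actual LiveCycle database name
DBCC SHRINKDATABASE (adep, 10)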

Wednesday, March 06, 2013

Configuring LiveCycle to use AD

We recently stood up a LiveCycle ES3 instance and had to tie it to a new Active Directory installation on Windows Server 2008 R2.  The following are the settings we used to configure the users and groups.

Directory Users
---------------------------------------
Unique Identifier: objectGUID
Base DN: DC=something,DC=ext
Search Filter: (&(objectcategory=user))
Description: description
Full Name: displayName
Login ID: sAMAccountName
Last Name: sn
Given Name: givenName
Initials: initials
Business Calendar: l
Modify Timestamp: modifyTimeStamp
Organization: company
Primary Email: mail
Secondary Email: proxyAddresses
Telephone: telephoneNumber

Directory Groups
---------------------------------------
Unique Identifier: objectGUID
Base DN: DC=something,DC=ext
Search Filter: (objectcategory=group)
Description: description
Full Name: cn
Member DN: member
Member Unique Identifier: objectGUID
Organization: company
Primary Email: mail
Secondary Email: proxyAddresses
Modify Timestamp: modifyTimeStamp
