<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://sokwe.janegoodall.org/w/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=IanGilby</id>
	<title>sokwedb - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://sokwe.janegoodall.org/w/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=IanGilby"/>
	<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/wiki/Special:Contributions/IanGilby"/>
	<updated>2026-06-15T11:20:59Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.35.6</generator>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=593</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=593"/>
		<updated>2026-05-19T19:27:53Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#49) There are follow_arrivals where females that are too old have a cycle code of U */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
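&lt;br /&gt;
A minimal sketch of such a view, assuming hypothetical lower-case column names (&amp;lt;code&amp;gt;animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;dad_id&amp;lt;/code&amp;gt;) and a boolean &amp;lt;code&amp;gt;dadidprelim&amp;lt;/code&amp;gt; column; the actual SokweDB definition may differ:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only; table and column names are assumptions.&lt;br /&gt;
CREATE VIEW biography AS&lt;br /&gt;
  SELECT animid&lt;br /&gt;
       , CASE&lt;br /&gt;
           WHEN dadidprelim THEN dad_id || &amp;#039;_prelim&amp;#039;&lt;br /&gt;
           ELSE dad_id&lt;br /&gt;
         END AS dadid&lt;br /&gt;
    FROM biography_data;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;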
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Talk to Karl about kk_P0 and kk_P1&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
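&lt;br /&gt;
As a rough illustration of the substitution (the &amp;#039;UNK&amp;#039; person code comes from the description above; doing it with &amp;lt;code&amp;gt;COALESCE&amp;lt;/code&amp;gt; is an assumption, not necessarily how the conversion program does it):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show NULL made_by values as the unknown person.&lt;br /&gt;
SELECT COALESCE(made_by, &amp;#039;UNK&amp;#039;) AS made_by&lt;br /&gt;
  FROM clean.community_membership_update_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;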
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error; ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error; ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These involve &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
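&lt;br /&gt;
A minimal sketch of how this fix and the trailing-space removal of problem #31 might be combined and previewed, assuming the cleaned value is derived from the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema by a query.  The column alias is hypothetical, and the actual conversion code may differ:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: list every follow whose animid would change,&lt;br /&gt;
-- next to the value it would change to.&lt;br /&gt;
select upper(rtrim(fol_b_animid)) as cleaned_animid  -- hypothetical alias&lt;br /&gt;
     , follow.*&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;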
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These involve 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
The *_observer_1 columns go into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 columns go into FOLLOW_OBSERVERS.OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
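&lt;br /&gt;
The population rule described above can be sketched as a query that lists the rows that would be created.  This is an illustration, not the conversion code: the period codes &amp;lt;code&amp;gt;AM&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;PM&amp;lt;/code&amp;gt; and the output column aliases are assumptions.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: one candidate FOLLOW_OBSERVERS row per follow and time period.&lt;br /&gt;
-- A period is skipped when both of its observer columns are NULL or empty.&lt;br /&gt;
-- The period codes &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; are assumptions.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , &amp;#039;AM&amp;#039; as period&lt;br /&gt;
     , btrim(fol_am_observer_1) as obs_brec&lt;br /&gt;
     , btrim(fol_am_observer_2) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
union all&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , &amp;#039;PM&amp;#039; as period&lt;br /&gt;
     , btrim(fol_pm_observer_1) as obs_brec&lt;br /&gt;
     , btrim(fol_pm_observer_2) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;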
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person both observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
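&lt;br /&gt;
A minimal sketch of the placeholder rows this solution implies, assuming one row per affected follow.  The output column aliases are hypothetical, and the actual conversion code may differ:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: follows with no recorded observers get the UNK period&lt;br /&gt;
-- and the NONE person in both observer columns.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , &amp;#039;UNK&amp;#039; as period&lt;br /&gt;
     , &amp;#039;NONE&amp;#039; as obs_brec&lt;br /&gt;
     , &amp;#039;NONE&amp;#039; as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;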
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
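&lt;br /&gt;
The substitution can be sketched as follows, shown for the AM columns only (the PM columns would be handled the same way).  The output column aliases are hypothetical, and this is an illustration, not the conversion code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: when exactly one AM observer is recorded,&lt;br /&gt;
-- the missing one becomes the NONE person.&lt;br /&gt;
-- NULLIF turns an empty string into NULL so COALESCE can replace it.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;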
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates is complicated in the conversion process, so resolving this issue is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
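&lt;br /&gt;
A minimal sketch of the substitution, as a summary of the cycle codes that would result.  The column alias is hypothetical, and the actual conversion code may differ:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: count arrivals per cycle code, with NULL reported as MISS.&lt;br /&gt;
select coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state  -- hypothetical alias&lt;br /&gt;
     , count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  group by coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;)&lt;br /&gt;
  order by cycle_state;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;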
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot checks have Brec notes but no physical tiki&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
SN1 fixed in MS Access ICG 5/19/2026&lt;br /&gt;
&lt;br /&gt;
Confirm CT and SG were with CA on 10/23/1980 from brec swahili&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
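&lt;br /&gt;
The fix itself was made in MS Access.  Should the same rule ever need to be applied on the PostgreSQL side, it could be sketched as below.  This assumes the cycle code &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; is stored as text, and it is an illustration rather than a record of what was run:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: non-females (including unknown sex) with a cycle code of 0&lt;br /&gt;
-- get a cycle code of n/a.  Assumes fa_type_of_cycle holds text.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;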
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
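&lt;br /&gt;
The age boundary can be illustrated with a worked example.  The follow date used here is made up, and the column alias is hypothetical:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: for a follow on 2000-06-15 the cutoff is 15 full years earlier.&lt;br /&gt;
-- A birthdate on or before the result is flagged.&lt;br /&gt;
select date &amp;#039;2000-06-15&amp;#039;&lt;br /&gt;
         - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
         - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
       as latest_flagged_birthdate;  -- hypothetical alias&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So a female is flagged only once she has passed her 15th birthday by the date of the follow.&lt;br /&gt;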
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
5/19/2026: MGFs have a dummy birthdate which is why the age is being flagged. The rest are what the observer recorded, so let them in.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
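&lt;br /&gt;
A minimal sketch of the change, assuming it is applied as an update to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  The actual conversion code may differ:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: females with a cycle code of n/a get the MISS code.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;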
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
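&lt;br /&gt;
The two case corrections could be sketched as a single statement.  This assumes the change is made in place in &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;; how the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema is actually populated is not described here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the mis-cased data source codes&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = case fa_data_source&lt;br /&gt;
                         when &amp;#039;BREC&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         when &amp;#039;brec&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         when &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
                       end&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;, &amp;#039;TikI&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;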
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
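&lt;br /&gt;
No query is given above for the second check in the Problem section, the one that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;.  The following is a sketch adapted from the previous query by dropping that column.  It is an assumption, not a confirmed fact, that this is how the 430-row figure was obtained.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: checking against start time, ignoring end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;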
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
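&lt;br /&gt;
A minimal sketch of the cleanup, assuming the column is a variable-length text type and that the fix is applied in place in &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal animid&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;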
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between column names and primary key designations.&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything concerning the rows affected, but it is adequate.&lt;br /&gt;
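&lt;br /&gt;
As a sketch, assuming the change is made in place in &amp;lt;code&amp;gt;clean.biography&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace zero animal ID numbers with NULL&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = null&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;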
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
===SOLUTION===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 8c05232.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
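&lt;br /&gt;
This rule could be expressed as a &amp;lt;code&amp;gt;CASE&amp;lt;/code&amp;gt; expression.  The following is a sketch, not the conversion code: the output column name &amp;lt;code&amp;gt;bad_observation&amp;lt;/code&amp;gt; is hypothetical, and using &amp;lt;code&amp;gt;btrim()&amp;lt;/code&amp;gt; so that empty strings and runs of spaces count as spaces is an assumption.  Any other value yields &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, so that unexpected values remain visible.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: map the raw flag to a boolean&lt;br /&gt;
SELECT ae.ae_bad_observation_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL THEN FALSE&lt;br /&gt;
         WHEN btrim(ae.ae_bad_observation_flag) = &amp;#039;&amp;#039; THEN FALSE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
       END AS bad_observation  -- hypothetical column name&lt;br /&gt;
  FROM easy.aggression_event AS ae;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;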
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be, at most, one row per community, per year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 clean.aggression_event rows that do not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id records that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is not a separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have an ae_recipient_certainty_flag other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt;, as required.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;N&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;NO&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
** &amp;#039;Y%&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
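&lt;br /&gt;
The proposed conversions above could be previewed, before anything is changed, with a query like the following.  This is a sketch only: the output column name &amp;lt;code&amp;gt;converted_flag&amp;lt;/code&amp;gt; is hypothetical, and reading &amp;lt;code&amp;gt;Y%&amp;lt;/code&amp;gt; as a &amp;lt;code&amp;gt;LIKE&amp;lt;/code&amp;gt; pattern is an assumption.  Values not covered by the list, including &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, show a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; conversion.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show each raw flag value, its proposed conversion, and its row count&lt;br /&gt;
SELECT ae.ae_recipient_certainty_flag AS raw_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag = &amp;#039; &amp;#039; THEN &amp;#039;0&amp;#039;&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag IN (&amp;#039;N&amp;#039;, &amp;#039;NO&amp;#039;) THEN &amp;#039;0&amp;#039;&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
       END AS converted_flag  -- hypothetical column name&lt;br /&gt;
     , count(*) AS row_count&lt;br /&gt;
  FROM clean.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_recipient_certainty_flag&lt;br /&gt;
  ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;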
&lt;br /&gt;
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among the AGGRESSION_EVENT flags (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag) there are 2,864 rows where one or more of the flags has a value other than `X` or NULL, the only values allowed.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
** `?` to &amp;#039; &amp;#039;&lt;br /&gt;
** `Y%` to `X`&lt;br /&gt;
** `X%` to `X`&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039;&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
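&lt;br /&gt;
The two conversion steps above could be previewed together with something like the following sketch, shown for ae_chase_flag only; the other flags would follow the same pattern. The case-insensitive matching and the handling of NULL as a blank in the sokwe step are assumptions, and clean_value and sokwe_value are illustrative names.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Preview only; nothing is updated.&lt;br /&gt;
WITH cleaned AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
    ae.ae_chase_flag AS raw_value,&lt;br /&gt;
    CASE&lt;br /&gt;
      WHEN BTRIM(ae.ae_chase_flag) = &amp;#039;?&amp;#039; THEN &amp;#039; &amp;#039;&lt;br /&gt;
      WHEN UPPER(BTRIM(ae.ae_chase_flag)) LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
      WHEN UPPER(BTRIM(ae.ae_chase_flag)) LIKE &amp;#039;X%&amp;#039; THEN &amp;#039;X&amp;#039;&lt;br /&gt;
      ELSE ae.ae_chase_flag -- NULL or unmapped; left as-is&lt;br /&gt;
    END AS clean_value&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  raw_value,&lt;br /&gt;
  clean_value,&lt;br /&gt;
  CASE&lt;br /&gt;
    -- Assumption: NULL and blank both become 0.&lt;br /&gt;
    WHEN BTRIM(COALESCE(clean_value, &amp;#039;&amp;#039;)) = &amp;#039;&amp;#039; THEN &amp;#039;0&amp;#039;&lt;br /&gt;
    WHEN clean_value = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
    ELSE NULL -- still non-compliant after cleaning; needs review&lt;br /&gt;
  END AS sokwe_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM cleaned&lt;br /&gt;
GROUP BY raw_value, clean_value&lt;br /&gt;
ORDER BY row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;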
&lt;br /&gt;
== * (#75) There are AGGRESSION_EVENT records where ae_extracted_by values are not in the people table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 28,513 AGGRESSION_EVENT records where ae_extracted_by does not match a person in the PEOPLE table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  s.ae_date,&lt;br /&gt;
  s.ae_time,&lt;br /&gt;
  s.ae_fol_b_focal_id,&lt;br /&gt;
  s.ae_b_aggressor_id,&lt;br /&gt;
  s.ae_b_recipient_id,&lt;br /&gt;
  s.ae_extracted_by,&lt;br /&gt;
  s.ae_source,&lt;br /&gt;
  s.ae_full_description,&lt;br /&gt;
  s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;)) AS raw_extractedby,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, raw_extractedby;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#76) There are AGGRESSION_EVENT rows where the Actor/Actee are not in the biography table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,439 combined rows (but see the notes below) where the ae_b_aggressor_id and/or ae_b_recipient_id is not in the BIOGRAPHY table.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* The row count is inflated by the CROSS JOIN LATERAL, which pivots each AGGRESSION_EVENT row into one record for ae_b_aggressor_id and one for ae_b_recipient_id. The count is therefore the number of offending pivoted records, not the number of offending AGGRESSION_EVENT rows.&lt;br /&gt;
* The queries currently exclude rows where ae_b_recipient_id is NULL, which seems a related but separate problem (see #73); the count jumps to 10,357 if those rows are included.&lt;br /&gt;
* How are we treating ae_b_recipient_id values such as `group`, `males`, `females`, etc.?&lt;br /&gt;
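&lt;br /&gt;
To get the number of offending AGGRESSION_EVENT rows rather than pivoted records, a count along the following lines might work. It is a sketch that has not been run against the data; it uses the same scoping as the queries on this page and counts each event once, whether the aggressor, the recipient, or both are missing from BIOGRAPHY.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT COUNT(*) AS offending_rows&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    -- Aggressor is present but not in BIOGRAPHY&lt;br /&gt;
    (BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
     AND NOT EXISTS (&lt;br /&gt;
       SELECT 1 FROM clean.biography b&lt;br /&gt;
       WHERE b.b_animid = BTRIM(ae.ae_b_aggressor_id)))&lt;br /&gt;
    -- Recipient is present but not in BIOGRAPHY&lt;br /&gt;
 OR (BTRIM(ae.ae_b_recipient_id) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
     AND NOT EXISTS (&lt;br /&gt;
       SELECT 1 FROM clean.biography b&lt;br /&gt;
       WHERE b.b_animid = BTRIM(ae.ae_b_recipient_id)))&lt;br /&gt;
);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;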
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Detailed rows against clean.biography&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    s.ae_date,&lt;br /&gt;
    s.ae_time,&lt;br /&gt;
    s.ae_fol_b_focal_id,&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant,&lt;br /&gt;
    s.ae_b_aggressor_id,&lt;br /&gt;
    s.ae_b_recipient_id,&lt;br /&gt;
    s.ae_source,&lt;br /&gt;
    s.ae_full_description,&lt;br /&gt;
    s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, BTRIM(COALESCE(s.ae_b_aggressor_id, &amp;#039;&amp;#039;))),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, BTRIM(COALESCE(s.ae_b_recipient_id, &amp;#039;&amp;#039;)))&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
WHERE v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time, v.role_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Summary against clean.biography&lt;br /&gt;
WITH role_values AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      &amp;#039;Actor&amp;#039;::text AS role_name,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) AS participant&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
&lt;br /&gt;
  UNION ALL&lt;br /&gt;
&lt;br /&gt;
  SELECT&lt;br /&gt;
      &amp;#039;Actee&amp;#039;::text AS role_name,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;)) AS participant&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    rv.role_name,&lt;br /&gt;
    rv.participant,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM role_values rv&lt;br /&gt;
WHERE rv.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = rv.participant&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY rv.role_name, rv.participant&lt;br /&gt;
ORDER BY rv.role_name, row_count DESC, rv.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#77) There are AGGRESSION_EVENT rows where ae_recipient_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,102 records where ae_recipient_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_recipient_behavior IS NULL&lt;br /&gt;
  OR ae_recipient_behavior = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#78) There are AGGRESSION_EVENT rows where ae_full_description is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,718 records where ae_full_description is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_full_description IS NULL&lt;br /&gt;
  OR ae_full_description = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=592</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=592"/>
		<updated>2026-05-19T19:27:05Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* Solution */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; the communities they name do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Talk to Karl about kk_P0 and kk_P1&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
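&lt;br /&gt;
A minimal sketch of the substitution follows.  It is an illustration only, not the conversion program&amp;#039;s code; the sample rows are made up.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample rows standing in for the update log.&lt;br /&gt;
with sample_log (chimp_id, made_by) as (&lt;br /&gt;
  values (&amp;#039;AA&amp;#039;, &amp;#039;XYZ&amp;#039;)&lt;br /&gt;
       , (&amp;#039;BB&amp;#039;, null))&lt;br /&gt;
-- A NULL made_by becomes the unknown person, UNK.&lt;br /&gt;
select chimp_id&lt;br /&gt;
     , coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from sample_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;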
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore. Default to data from Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
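&lt;br /&gt;
Together with the trailing space removal of problem #31, the cleanup might look like the sketch below.  The sample animids are made up, and this is not necessarily how the conversion program does it.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample animids: one with trailing spaces, one lower-case.&lt;br /&gt;
with sample_follow (fol_b_animid) as (&lt;br /&gt;
  values (&amp;#039;AA  &amp;#039;)&lt;br /&gt;
       , (&amp;#039;bb&amp;#039;))&lt;br /&gt;
select fol_b_animid as old_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as new_animid&lt;br /&gt;
  from sample_follow;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;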
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
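&lt;br /&gt;
The AM half of the process described above might be sketched as follows.  This is an untested illustration, not the conversion program&amp;#039;s code: the period code &amp;#039;AM&amp;#039; and the output column names are assumptions.  The PM rows would come from the fol_pm_* columns in the same way.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- One row per follow for the AM period, skipped when both observers are blank.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , &amp;#039;AM&amp;#039; as period  -- assumed period code&lt;br /&gt;
     , btrim(fol_am_observer_1) as obs_brec&lt;br /&gt;
     , btrim(fol_am_observer_2) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;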
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
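&lt;br /&gt;
A query along the lines of the one in problem #37 could list these follows.  The sketch below is untested and assumes that &amp;quot;only one observer&amp;quot; means exactly one of the two observer columns of a time period is blank.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Follows where, in the AM or the PM period, one observer is blank and the other is not.&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
     or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;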
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
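&lt;br /&gt;
As a rough illustration of the marking rule, with made-up sample codes and an assumed &amp;lt;code&amp;gt;active&amp;lt;/code&amp;gt; flag (the real column name may differ):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample people codes.&lt;br /&gt;
with sample_people (person) as (&lt;br /&gt;
  values (&amp;#039;AA&amp;#039;)&lt;br /&gt;
       , (&amp;#039;AA/BB&amp;#039;))&lt;br /&gt;
-- A person is active only when the code contains no &amp;quot;/&amp;quot; character.&lt;br /&gt;
select person&lt;br /&gt;
     , strpos(person, &amp;#039;/&amp;#039;) = 0 as active&lt;br /&gt;
  from sample_people;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;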
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for this and excluding duplicates in the conversion process is complicated, so resolution of this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
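&lt;br /&gt;
A minimal sketch of the second half of this solution, assuming the fix is applied in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and that &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; has already been added as a &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value (both are assumptions, not something recorded above):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: the change is made in the clean schema.&lt;br /&gt;
-- Give every arrival with no cycle information the MISS code.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  WHERE fa_type_of_cycle IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;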
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot checks have Brec notes but no physical tiki.&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
SN1 fixed in MS Access ICG 5/19/2026&lt;br /&gt;
&lt;br /&gt;
Confirm CT and SG were with CA on 10/23/1980 from brec swahili&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
5/19/2026: MGFs have a dummy birthdate which is why the age is being flagged. The rest are what the observer recorded, so let them in.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
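&lt;br /&gt;
The cycle code change described in this solution could be written roughly as follows. This is a sketch modeled on the query in the Bad data section, not necessarily the statement the conversion actually runs:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: re-code female arrivals from n/a to MISS in the clean schema.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  FROM clean.biography&lt;br /&gt;
  WHERE biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        AND biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        AND follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;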
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
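&lt;br /&gt;
The two case corrections in this solution could be sketched as follows, assuming they are applied in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (the Bad data note says the data has been cleaned there, but the actual statements are not recorded on this page):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the letter case of the data source codes.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  WHERE fa_data_source IN (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  WHERE fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;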
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for two rows to be duplicates?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
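&lt;br /&gt;
The Problem section also mentions a check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;, but no query for it is listed. A sketch of such a check, modeled on the query above, follows. It is not confirmed to be the query that produced the 430-row count.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (sketch)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
As with the query above, rows whose compared columns are &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; are not returned, because &amp;lt;code&amp;gt;=&amp;lt;/code&amp;gt; never matches a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;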
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed FN community start date in MS Access - Oct 2025&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
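&lt;br /&gt;
A minimal sketch of that cleanup, assuming the column is a variable-length text type (as the comparison in the Bad data query implies):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal id, touching only the rows that have them.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_fol_b_focal_animid = RTRIM(fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE fa_fol_b_focal_animid &amp;lt;&amp;gt; RTRIM(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;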
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
MS Access likely allows this condition because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
ERROR MESSAGE, seen when a query refers to the column as &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; (the actual name ends in &amp;quot;AnimID&amp;quot;): Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything about the affected rows, but it is adequate.&lt;br /&gt;
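&lt;br /&gt;
As a sketch, assuming the change is made in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema that the query above reads from:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace the placeholder 0 with NULL; no other validation is done.&lt;br /&gt;
UPDATE clean.biography&lt;br /&gt;
  SET b_animid_num = NULL&lt;br /&gt;
  WHERE b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;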
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ian to fix.&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 9c114ac.&lt;br /&gt;
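&lt;br /&gt;
A minimal sketch of the cleanup, assuming the affected table is &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;; this is an illustration and may differ from what the referenced commit does:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Assumption: the table to fix is clean.follow_arrival.&lt;br /&gt;
-- Only rows that actually have trailing spaces are rewritten.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_b_arr_animid = rtrim(fa_b_arr_animid)&lt;br /&gt;
  WHERE fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&amp;lt;/pre&amp;gt;&lt;br /&gt;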
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Karl to fix.&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 8c05232.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ian fixed in Access, 3/25/2026.&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
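&lt;br /&gt;
One way to preview this mapping is a &amp;lt;code&amp;gt;CASE&amp;lt;/code&amp;gt; expression. The sketch below assumes the flag is a text column, and the &amp;lt;code&amp;gt;bad_observation&amp;lt;/code&amp;gt; column alias is made up for illustration. Any value not covered by the rule falls through to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, so unexpected values remain visible:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: assumes ae_bad_observation_flag is a text column.&lt;br /&gt;
SELECT CASE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL THEN FALSE&lt;br /&gt;
         WHEN btrim(ae.ae_bad_observation_flag) = &amp;#039;&amp;#039; THEN FALSE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
       END AS bad_observation  -- NULL here means an unmapped value&lt;br /&gt;
     , count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY 1;&amp;lt;/pre&amp;gt;&lt;br /&gt;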
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be at most one row per community, per year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 rows in clean.aggression_event that do not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id records that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is not a separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than `Y` or `N` (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have an ae_recipient_certainty_flag value other than the required &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;N&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;NO&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
** &amp;#039;Y%&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
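&lt;br /&gt;
The proposed mapping above, which is still awaiting confirmation, can be previewed before it is applied. This is a sketch only, and the &amp;lt;code&amp;gt;converted_flag&amp;lt;/code&amp;gt; column alias is made up for illustration. Raw values that the list does not cover show up with a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; converted value:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: previews the proposed (unconfirmed) mapping, changes no data.&lt;br /&gt;
SELECT ae.ae_recipient_certainty_flag AS raw_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag = &amp;#039; &amp;#039; THEN &amp;#039;0&amp;#039;&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag IN (&amp;#039;N&amp;#039;, &amp;#039;NO&amp;#039;) THEN &amp;#039;0&amp;#039;&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
       END AS converted_flag  -- NULL here means the list has no rule for raw_flag&lt;br /&gt;
     , count(*) AS row_count&lt;br /&gt;
  FROM clean.aggression_event AS ae&lt;br /&gt;
  GROUP BY 1, 2&lt;br /&gt;
  ORDER BY 1;&amp;lt;/pre&amp;gt;&lt;br /&gt;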
&lt;br /&gt;
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among AGGRESSION_EVENT flags (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag) there are 2,864 rows that have a value other than the required &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; for one or more of the flags.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
** `?` to &amp;#039; &amp;#039;&lt;br /&gt;
** `Y%` to `X`&lt;br /&gt;
** `X%` to `X`&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039;&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
&lt;br /&gt;
== * (#75) There are AGGRESSION_EVENT records where ae_extracted_by values are not in the people table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 28,513 AGGRESSION_EVENT records where ae_extracted_by does not match a person in the PEOPLE table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  s.ae_date,&lt;br /&gt;
  s.ae_time,&lt;br /&gt;
  s.ae_fol_b_focal_id,&lt;br /&gt;
  s.ae_b_aggressor_id,&lt;br /&gt;
  s.ae_b_recipient_id,&lt;br /&gt;
  s.ae_extracted_by,&lt;br /&gt;
  s.ae_source,&lt;br /&gt;
  s.ae_full_description,&lt;br /&gt;
  s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;)) AS raw_extractedby,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, raw_extractedby;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#76) There are AGGRESSION_EVENT rows where the Actor/Actee are not in the biography table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,439 combined (but see following!) rows where the ae_b_aggressor_id and/or ae_b_recipient_id is not in the BIOGRAPHY table.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* The row count is inflated by a CROSS JOIN LATERAL, which yields the number of combined, pivoted records for both ae_b_aggressor_id and ae_b_recipient_id, not the actual number of confounding AGGRESSION_EVENT rows.&lt;br /&gt;
* The queries currently exclude ae_b_recipient_id values that are NULL, which seems a related but separate problem; the number of rows jumps to 10,357 if rows where ae_b_recipient_id is NULL are included.&lt;br /&gt;
* How are we treating ae_b_recipient_id values such as &amp;lt;code&amp;gt;group&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;males&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;females&amp;lt;/code&amp;gt;, etc.?&lt;br /&gt;
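&lt;br /&gt;
To get the number of distinct AGGRESSION_EVENT rows rather than pivoted records, each event can be counted once when either participant is missing from BIOGRAPHY. The following is a sketch that reuses the scoping of the queries in this section; the &amp;lt;code&amp;gt;event_row_count&amp;lt;/code&amp;gt; alias is made up for illustration:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: one count per event, however many of its participants are unknown.&lt;br /&gt;
SELECT count(*) AS event_row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
       -- aggressor present but unknown to biography&lt;br /&gt;
       (BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                        FROM clean.biography b&lt;br /&gt;
                        WHERE b.b_animid = BTRIM(ae.ae_b_aggressor_id)))&lt;br /&gt;
       -- recipient present but unknown to biography&lt;br /&gt;
    OR (BTRIM(ae.ae_b_recipient_id) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                        FROM clean.biography b&lt;br /&gt;
                        WHERE b.b_animid = BTRIM(ae.ae_b_recipient_id)))&lt;br /&gt;
  );&amp;lt;/pre&amp;gt;&lt;br /&gt;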
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Detailed rows against clean.biography&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    s.ae_date,&lt;br /&gt;
    s.ae_time,&lt;br /&gt;
    s.ae_fol_b_focal_id,&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant,&lt;br /&gt;
    s.ae_b_aggressor_id,&lt;br /&gt;
    s.ae_b_recipient_id,&lt;br /&gt;
    s.ae_source,&lt;br /&gt;
    s.ae_full_description,&lt;br /&gt;
    s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, BTRIM(COALESCE(s.ae_b_aggressor_id, &amp;#039;&amp;#039;))),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, BTRIM(COALESCE(s.ae_b_recipient_id, &amp;#039;&amp;#039;)))&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
WHERE v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time, v.role_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Summary against clean.biography&lt;br /&gt;
WITH role_values AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      &amp;#039;Actor&amp;#039;::text AS role_name,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) AS participant&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
&lt;br /&gt;
  UNION ALL&lt;br /&gt;
&lt;br /&gt;
  SELECT&lt;br /&gt;
      &amp;#039;Actee&amp;#039;::text AS role_name,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;)) AS participant&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    rv.role_name,&lt;br /&gt;
    rv.participant,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM role_values rv&lt;br /&gt;
WHERE rv.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = rv.participant&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY rv.role_name, rv.participant&lt;br /&gt;
ORDER BY rv.role_name, row_count DESC, rv.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#77) There are AGGRESSION_EVENT rows where ae_recipient_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,102 records where ae_recipient_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_recipient_behavior IS NULL&lt;br /&gt;
  OR ae_recipient_behavior = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#78) There are AGGRESSION_EVENT rows where ae_full_description is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,718 records where ae_full_description is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_full_description IS NULL&lt;br /&gt;
  OR ae_full_description = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=591</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=591"/>
		<updated>2026-05-19T19:16:37Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* Solution */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When the query is run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted the row.&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted the time stamp.&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted the time stamp.&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID_publication_info&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column contains non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
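&lt;br /&gt;
A minimal sketch of the check and of one possible fix follows. The page does not say how the conversion repairs the values, so dropping the date part is an assumption, and the names ACCESS_ZERO_DATE, has_stray_date and time_only are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
from datetime import date, datetime

# The date part that the query above expects on time-only values.
ACCESS_ZERO_DATE = date(1899, 12, 30)


def has_stray_date(value):
    """True when a time-only value carries a date other than the zero date."""
    return value.date() != ACCESS_ZERO_DATE


def time_only(value):
    """Keep only the time of day (an assumed form of the fix, not the actual one)."""
    return value.time()


good = datetime(1899, 12, 30, 7, 45)
bad = datetime(2004, 6, 1, 7, 45)  # hypothetical bad row
```
&lt;br /&gt;
Under this assumption both sample values end up with the same time of day once the date part is dropped.&lt;br /&gt;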
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Talk to Karl about kk_P0 and kk_P1&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
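&lt;br /&gt;
The overlap test that the query applies can be sketched as follows. The dictionary keys, the sample rows and the function name overlapping_pairs are hypothetical, and open-ended (NULL) end dates are not modeled.&lt;br /&gt;
&lt;br /&gt;
```python
from datetime import date


def overlapping_pairs(memberships):
    """Pair memberships of one individual where the earlier one is still open
    on the day the later one starts."""
    ordered = sorted(memberships, key=lambda m: (m["animid"], m["start"]))
    pairs = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            if second["animid"] != first["animid"]:
                break  # rows are sorted by animid, so no later row can match
            # Mirrors the query: a strictly later start, and the first
            # membership ends on or after that start.
            if second["start"] > first["start"] and first["end"] >= second["start"]:
                pairs.append((first, second))
    return pairs


# Hypothetical sample data.
memberships = [
    {"animid": "AAA", "community": "C1", "start": date(2020, 1, 1), "end": date(2022, 8, 20)},
    {"animid": "AAA", "community": "C2", "start": date(2022, 8, 15), "end": date(2023, 1, 1)},
    {"animid": "BBB", "community": "C1", "start": date(2020, 1, 1), "end": date(2021, 1, 1)},
]
found = overlapping_pairs(memberships)
```
&lt;br /&gt;
Like the query, this sketch does not report two memberships that start on the same day, because the start dates are compared strictly.&lt;br /&gt;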
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 by ICG.&lt;br /&gt;
Changed the end date of the KK_P1 membership to 8/14/2022.&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
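&lt;br /&gt;
The substitution and the &amp;quot;inactive&amp;quot; rule described above could look roughly like this. It is a sketch, not the conversion code: the people mapping, the XYZ code and both function names are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
UNKNOWN_PERSON = "UNK"  # the code named in the text


def made_by_or_unknown(made_by):
    """Replace a missing MadeBy value with the unknown person code."""
    return UNKNOWN_PERSON if made_by is None else made_by


# Hypothetical people table: person code -> active flag.
people = {"UNK": False, "XYZ": True}


def may_enter(person):
    """New data may only name active people, so UNK is rejected at entry time."""
    return people.get(person, False)
```
&lt;br /&gt;
Converted rows keep a non-NULL MadeBy, while newly entered rows cannot use the placeholder.&lt;br /&gt;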
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
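&lt;br /&gt;
The comparison made by the query above can be sketched as follows. The dictionary keys, the sample rows and the function name mismatched_starts are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
def mismatched_starts(follows, arrivals):
    """Return follows whose begin time differs from the focal's earliest arrival."""
    earliest = {}
    for a in arrivals:
        if a["arrival_animid"] != a["focal_animid"]:
            continue  # only the focal's own arrival rows count, as in the query
        key = (a["date"], a["focal_animid"])
        earliest[key] = min(a["time_start"], earliest.get(key, a["time_start"]))
    mismatches = []
    for f in follows:
        key = (f["date"], f["animid"])
        if key in earliest and earliest[key] != f["time_begin"]:
            mismatches.append((key, f["time_begin"], earliest[key]))
    return mismatches


# Hypothetical sample data; times are zero-padded "HH:MM" strings, which sort correctly.
follows = [
    {"date": "1990-05-01", "animid": "AAA", "time_begin": "06:30"},
    {"date": "1990-05-02", "animid": "BBB", "time_begin": "07:00"},
]
arrivals = [
    {"date": "1990-05-01", "focal_animid": "AAA", "arrival_animid": "AAA", "time_start": "06:45"},
    {"date": "1990-05-01", "focal_animid": "AAA", "arrival_animid": "BBB", "time_start": "06:10"},
    {"date": "1990-05-02", "focal_animid": "BBB", "arrival_animid": "BBB", "time_start": "07:00"},
]
result = mismatched_starts(follows, arrivals)
```
&lt;br /&gt;
Only the first follow is reported: its recorded begin time is 06:30 while the focal&amp;#039;s earliest own arrival is 06:45.&lt;br /&gt;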
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
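&lt;br /&gt;
The mismatch conditions used in the queries for problems #26 and #27 can be sketched as follows. The nesting codes are read off the two queries only (1 or 3 for a start in the nest, 2 or 3 for an end in the nest); the function names are hypothetical, and NULL values are not modeled.&lt;br /&gt;
&lt;br /&gt;
```python
# Codes inferred from the two queries; the meaning of other codes is not assumed.
BEGIN_IN_NEST_TYPES = {1, 3}
END_IN_NEST_TYPES = {2, 3}


def flag_mismatch(nesting_type, follow_flag, in_nest_types):
    """True when the arrival row and the follow flag disagree about the nest."""
    in_nest = nesting_type in in_nest_types
    return (in_nest and follow_flag == 0) or (not in_nest and follow_flag == 1)


def begin_mismatch(first_arrival_type, fol_flag_begin_in_nest):
    """Condition of the problem #26 query, applied to the focal's first arrival."""
    return flag_mismatch(first_arrival_type, fol_flag_begin_in_nest, BEGIN_IN_NEST_TYPES)


def end_mismatch(last_arrival_type, fol_flag_end_in_nest):
    """Condition of the problem #27 query, applied to the focal's last arrival."""
    return flag_mismatch(last_arrival_type, fol_flag_end_in_nest, END_IN_NEST_TYPES)
```
&lt;br /&gt;
Under the accepted solution the follow flag would simply be derived from the arrival row, so these functions only identify which follows disagree.&lt;br /&gt;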
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These involve &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
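&lt;br /&gt;
As a sketch of this cleanup, combined with the trailing-space removal from problem #31, the following shows the original and cleaned values side by side.  This is an illustration, not the actual conversion code; the &amp;lt;code&amp;gt;original&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;cleaned&amp;lt;/code&amp;gt; column names are made up.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: list the animids that the cleanup would change.&lt;br /&gt;
select fol_b_animid as original&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;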
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These involve 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
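&lt;br /&gt;
As a rough sketch of the row-creation rule described above (not the actual conversion code), the following lists the follow and period combinations that would produce a FOLLOW_OBSERVERS row.  The period labels &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; and the output column names are assumptions used for illustration; the real PERIODS codes may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; are assumed period labels.&lt;br /&gt;
select fol_date, fol_b_animid, &amp;#039;AM&amp;#039; as period&lt;br /&gt;
     , btrim(fol_am_observer_1) as obs_brec&lt;br /&gt;
     , btrim(fol_am_observer_2) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
union all&lt;br /&gt;
select fol_date, fol_b_animid, &amp;#039;PM&amp;#039; as period&lt;br /&gt;
     , btrim(fol_pm_observer_1) as obs_brec&lt;br /&gt;
     , btrim(fol_pm_observer_2) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;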
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person both observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
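&lt;br /&gt;
No query was recorded for this problem.  A query along the following lines, modeled on the one for problem #37, should list the affected follows.  It is a sketch and has not been run against the data.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: follows where, in either period, exactly one of the&lt;br /&gt;
-- two observer columns has a value.&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;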
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case, so about half that many in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these and excluding duplicates in the conversion process is complicated, so resolving this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
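&lt;br /&gt;
One way to express this substitution, as a sketch rather than the actual conversion code (the &amp;lt;code&amp;gt;cycle_state&amp;lt;/code&amp;gt; output column name is made up for illustration):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show the rows that would receive the MISS code.&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;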
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec.  Spot checks have Brec notes but no physical tiki.&lt;br /&gt;
Need to investigate further (brec swahili?).&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
SN1 fixed in MS Access ICG 5/19/2026&lt;br /&gt;
&lt;br /&gt;
Confirm CT and SG were with CA on 10/23/1980 from brec swahili&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
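&lt;br /&gt;
To illustrate how the cutoff in the query above works, here is the date arithmetic for a single made-up follow date.  The date is hypothetical and chosen only for the example.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Worked example with a made-up follow date of 2000-06-15.&lt;br /&gt;
select date &amp;#039;2000-06-15&amp;#039;&lt;br /&gt;
         - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
         - &amp;#039;14 years&amp;#039;::INTERVAL as latest_flagged_birthdate;&lt;br /&gt;
-- 2000-06-15 less 15 years is 1985-06-15, so a female born on or&lt;br /&gt;
-- before 1985-06-15 (at least 15 years old on the follow date)&lt;br /&gt;
-- is flagged when her cycle code is U.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;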
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
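&lt;br /&gt;
The case corrections above could be sketched as follows.  This is an illustration against the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, not the actual conversion code, and the &amp;lt;code&amp;gt;cleaned_data_source&amp;lt;/code&amp;gt; column name is made up.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: map the mis-cased codes to their canonical spelling&lt;br /&gt;
-- and leave every other code unchanged.&lt;br /&gt;
select fa_data_source&lt;br /&gt;
     , case&lt;br /&gt;
         when fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;) then &amp;#039;Brec&amp;#039;&lt;br /&gt;
         when fa_data_source = &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
         else fa_data_source&lt;br /&gt;
       end as cleaned_data_source&lt;br /&gt;
     , count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  group by fa_data_source&lt;br /&gt;
  order by fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;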
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, October 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
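&lt;br /&gt;
A minimal sketch of such a cleanup follows. It is not the conversion code itself: &amp;lt;code&amp;gt;fa_demo&amp;lt;/code&amp;gt; is a hypothetical scratch table standing in for the real follow_arrival table, and the sample ids are made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical scratch table; stands in for the real follow_arrival table.&lt;br /&gt;
CREATE TEMP TABLE fa_demo (fa_fol_b_focal_animid text);&lt;br /&gt;
INSERT INTO fa_demo VALUES (&amp;#039;FN  &amp;#039;), (&amp;#039;FD&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
-- Trim only the rows that actually carry trailing spaces.&lt;br /&gt;
UPDATE fa_demo&lt;br /&gt;
  SET fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&lt;br /&gt;
-- Re-run the check from above; it should now list no rows.&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM fa_demo&lt;br /&gt;
  WHERE fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;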
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between column names and primary key designations.&lt;br /&gt;
The quoted column name ends in &amp;lt;code&amp;gt;AnimID&amp;lt;/code&amp;gt;, not &amp;lt;code&amp;gt;AnimId&amp;lt;/code&amp;gt;. Running the query with &amp;lt;code&amp;gt;&amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&amp;lt;/code&amp;gt; fails as follows.&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, since it does not validate anything about the affected rows, but it is adequate.&lt;br /&gt;
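&lt;br /&gt;
A minimal sketch of this change follows, assuming a plain &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; is acceptable. &amp;lt;code&amp;gt;biography_demo&amp;lt;/code&amp;gt; is a hypothetical scratch table standing in for &amp;lt;code&amp;gt;clean.biography&amp;lt;/code&amp;gt;, and the sample rows are made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical scratch table; stands in for clean.biography.&lt;br /&gt;
CREATE TEMP TABLE biography_demo (b_animid text, b_animid_num integer);&lt;br /&gt;
INSERT INTO biography_demo VALUES (&amp;#039;AAA&amp;#039;, 12), (&amp;#039;BBB&amp;#039;, 0), (&amp;#039;CCC&amp;#039;, NULL);&lt;br /&gt;
&lt;br /&gt;
-- Replace every 0 with NULL; no other validation is attempted.&lt;br /&gt;
UPDATE biography_demo&lt;br /&gt;
  SET b_animid_num = NULL&lt;br /&gt;
  WHERE b_animid_num = 0;&lt;br /&gt;
&lt;br /&gt;
-- Re-run the check from above; it should now list no rows.&lt;br /&gt;
SELECT * FROM biography_demo WHERE b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;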
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (It is not in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focal pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 8c05232.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as a ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
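&lt;br /&gt;
The flag handling described for problems #65 through #68 could be sketched as a single expression, shown below against inline sample values rather than the real table. This is an illustration only: it assumes one rule is applied to all four flags, and it treats every value other than &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, which goes beyond what the solutions above state.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Inline sample values; not read from easy.aggression_event.&lt;br /&gt;
WITH demo (raw_flag) AS (&lt;br /&gt;
  VALUES (&amp;#039;X&amp;#039;), (&amp;#039;x&amp;#039;), (&amp;#039;YES&amp;#039;), (&amp;#039; &amp;#039;), (NULL)&lt;br /&gt;
)&lt;br /&gt;
SELECT raw_flag&lt;br /&gt;
       -- NULL and spaces collapse to the empty string, which is not in the list.&lt;br /&gt;
     , COALESCE(upper(btrim(raw_flag)), &amp;#039;&amp;#039;) IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) AS flag_value&lt;br /&gt;
  FROM demo;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;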
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be, at most, one row per community, per year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 records where clean.aggression_event does not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id records that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is not a separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have an ae_recipient_certainty_flag other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt;, as required.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;N&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;NO&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
** &amp;#039;Y%&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
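&lt;br /&gt;
The proposed conversions listed above could be written as a &amp;lt;code&amp;gt;CASE&amp;lt;/code&amp;gt; expression. The sketch below uses inline sample values rather than the real table, and it leaves any value not named in the list (including &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) unmapped, which is an assumption pending confirmation.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Inline sample values; not read from clean.aggression_event.&lt;br /&gt;
WITH demo (raw_flag) AS (&lt;br /&gt;
  VALUES (&amp;#039; &amp;#039;), (&amp;#039;N&amp;#039;), (&amp;#039;NO&amp;#039;), (&amp;#039;X&amp;#039;), (&amp;#039;Y&amp;#039;), (&amp;#039;YES&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
SELECT raw_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN raw_flag = &amp;#039; &amp;#039; THEN &amp;#039;0&amp;#039;&lt;br /&gt;
         WHEN raw_flag IN (&amp;#039;N&amp;#039;, &amp;#039;NO&amp;#039;) THEN &amp;#039;0&amp;#039;&lt;br /&gt;
         WHEN raw_flag = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
         WHEN raw_flag LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
         ELSE NULL  -- assumption: values outside the list stay unmapped&lt;br /&gt;
       END AS converted&lt;br /&gt;
  FROM demo;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;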
&lt;br /&gt;
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among the AGGRESSION_EVENT flags (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag) there are 2,864 rows that have a value other than &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt; or NULL (required) for one or more of the flags.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
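** &amp;lt;code&amp;gt;?&amp;lt;/code&amp;gt; to &amp;#039; &amp;#039;&lt;br /&gt;
** &amp;lt;code&amp;gt;Y%&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;&lt;br /&gt;
** &amp;lt;code&amp;gt;X%&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;&lt;br /&gt;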
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039;&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
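&lt;br /&gt;
A possible sketch of the combined mapping for a single flag column, for discussion only. It assumes the patterns are matched after trimming and upper-casing, that NULL, the empty string, and &amp;#039;?&amp;#039; all end up as &amp;#039;0&amp;#039;, and that any other value is left NULL for review; none of this is confirmed. The other flag columns would follow the same shape.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: preview how each raw ae_decided_flag value would map.&lt;br /&gt;
SELECT&lt;br /&gt;
  ae_decided_flag AS raw_value,&lt;br /&gt;
  CASE&lt;br /&gt;
    WHEN ae_decided_flag IS NULL THEN &amp;#039;0&amp;#039;                     -- assumption: NULL treated as unset&lt;br /&gt;
    WHEN BTRIM(ae_decided_flag) IN (&amp;#039;&amp;#039;, &amp;#039;?&amp;#039;) THEN &amp;#039;0&amp;#039;          -- empty or &amp;#039;?&amp;#039; becomes unset&lt;br /&gt;
    WHEN UPPER(BTRIM(ae_decided_flag)) LIKE &amp;#039;X%&amp;#039; THEN &amp;#039;1&amp;#039;     -- X% becomes X, then 1&lt;br /&gt;
    WHEN UPPER(BTRIM(ae_decided_flag)) LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;1&amp;#039;     -- Y% becomes X, then 1&lt;br /&gt;
    ELSE NULL                                                 -- unmapped value; needs review&lt;br /&gt;
  END AS sokwe_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
GROUP BY 1, 2&lt;br /&gt;
ORDER BY 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rows that come back with a NULL sokwe_value are the raw values the proposed rules do not cover.&lt;br /&gt;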
&lt;br /&gt;
== * (#75) There are AGGRESSION_EVENT records where ae_extracted_by values are not in the people table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 28,513 AGGRESSION_EVENT records where ae_extracted_by does not match a person in the PEOPLE table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  s.ae_date,&lt;br /&gt;
  s.ae_time,&lt;br /&gt;
  s.ae_fol_b_focal_id,&lt;br /&gt;
  s.ae_b_aggressor_id,&lt;br /&gt;
  s.ae_b_recipient_id,&lt;br /&gt;
  s.ae_extracted_by,&lt;br /&gt;
  s.ae_source,&lt;br /&gt;
  s.ae_full_description,&lt;br /&gt;
  s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;)) AS raw_extractedby,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, raw_extractedby;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#76) There are AGGRESSION_EVENT rows where the Actor/Actee are not in the biography table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,439 combined (but see following!) rows where the ae_b_aggressor_id and/or ae_b_recipient_id is not in the BIOGRAPHY table.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* The row count is inflated by the CROSS JOIN LATERAL, which yields the number of combined, pivoted records for both ae_b_aggressor_id and ae_b_recipient_id, not the actual number of offending AGGRESSION_EVENT rows.&lt;br /&gt;
* The queries currently exclude ae_b_recipient_id values that are NULL, which seems a related but separate problem; the number of rows jumps to 10,357 if rows where ae_b_recipient_id IS NULL are included.&lt;br /&gt;
* How are we treating ae_b_recipient_id values such as &amp;lt;code&amp;gt;group&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;males&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;females&amp;lt;/code&amp;gt;, etc.?&lt;br /&gt;
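&lt;br /&gt;
To get the number of distinct AGGRESSION_EVENT rows rather than pivoted records, a query along the following lines may help. It is a sketch that reuses the scoping and matching rules of the queries in this section (follow must exist, ae_b_recipient_id not NULL, trimmed comparison against clean.biography) and counts each event row once, even when both participants are unmatched.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: count each offending AGGRESSION_EVENT row once.&lt;br /&gt;
SELECT COUNT(*) AS event_row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
    -- Actor is non-empty and not in biography&lt;br /&gt;
    (BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
     AND NOT EXISTS (&lt;br /&gt;
       SELECT 1 FROM clean.biography b&lt;br /&gt;
       WHERE b.b_animid = BTRIM(ae.ae_b_aggressor_id)))&lt;br /&gt;
    OR&lt;br /&gt;
    -- Actee is non-empty and not in biography&lt;br /&gt;
    (BTRIM(ae.ae_b_recipient_id) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
     AND NOT EXISTS (&lt;br /&gt;
       SELECT 1 FROM clean.biography b&lt;br /&gt;
       WHERE b.b_animid = BTRIM(ae.ae_b_recipient_id)))&lt;br /&gt;
  );&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;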
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- 2) Detailed rows against clean.biography&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    s.ae_date,&lt;br /&gt;
    s.ae_time,&lt;br /&gt;
    s.ae_fol_b_focal_id,&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant,&lt;br /&gt;
    s.ae_b_aggressor_id,&lt;br /&gt;
    s.ae_b_recipient_id,&lt;br /&gt;
    s.ae_source,&lt;br /&gt;
    s.ae_full_description,&lt;br /&gt;
    s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, BTRIM(COALESCE(s.ae_b_aggressor_id, &amp;#039;&amp;#039;))),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, BTRIM(COALESCE(s.ae_b_recipient_id, &amp;#039;&amp;#039;)))&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
WHERE v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time, v.role_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- 1) Summary against clean.biography&lt;br /&gt;
WITH role_values AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      &amp;#039;Actor&amp;#039;::text AS role_name,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) AS participant&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
&lt;br /&gt;
  UNION ALL&lt;br /&gt;
&lt;br /&gt;
  SELECT&lt;br /&gt;
      &amp;#039;Actee&amp;#039;::text AS role_name,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;)) AS participant&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    rv.role_name,&lt;br /&gt;
    rv.participant,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM role_values rv&lt;br /&gt;
WHERE rv.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = rv.participant&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY rv.role_name, rv.participant&lt;br /&gt;
ORDER BY rv.role_name, row_count DESC, rv.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#77) There are AGGRESSION_EVENT rows where ae_recipient_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,102 records where ae_recipient_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_recipient_behavior IS NULL&lt;br /&gt;
  OR ae_recipient_behavior = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#78) There are AGGRESSION_EVENT rows where ae_full_description is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,718 records where ae_full_description is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_full_description IS NULL&lt;br /&gt;
  OR ae_full_description = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=590</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=590"/>
		<updated>2026-05-19T19:14:39Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* Solution */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which help eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Talk to Karl about kk_P0 and kk_P1&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale.&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
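&lt;br /&gt;
A sketch of the combined cleanup from problems #31 and #32, previewing the rows that would change.  It only reads from the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema; the column aliases are illustrative, and the conversion program may apply the cleanup differently.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Show each animid next to its cleaned form:&lt;br /&gt;
-- trailing spaces removed, then forced to upper-case.&lt;br /&gt;
select fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;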
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
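&lt;br /&gt;
No query was recorded for this problem.  A possible sketch follows, assuming a time period has a single observer when exactly one of its two observer columns is blank after trimming, as in the conversion process described in problem #36.  Placeholder values such as n/a are counted as observers here.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Follows where the AM or the PM period has exactly one observer.&lt;br /&gt;
-- Each parenthesized test is true when the column is blank;&lt;br /&gt;
-- the two tests differ when exactly one column of the pair is blank.&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
     or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;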
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot checks have Brec notes but no physical tiki&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
Confirm CT and SG were with CA on 10/23/1980 from brec swahili&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit and set a soft limit of 5 years in the warning system, with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit and set a soft limit of 14 years in the warning system, with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
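&lt;br /&gt;
A minimal sketch of this change, assuming it is applied as a single &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and that &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; is already an accepted cycle code there.  The actual conversion code is not shown on this page and may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: assumes a plain UPDATE is how the clean schema is fixed.&lt;br /&gt;
-- Selects the same rows as the &amp;quot;Bad data&amp;quot; query: female arrivals coded n/a.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;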
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
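&lt;br /&gt;
The two case corrections in this solution could be sketched as follows.  This assumes they are made with one &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, which this page does not confirm.  Rows with the newly added codes are left untouched.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the misspelled source codes listed in the solution.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source&lt;br /&gt;
        = case&lt;br /&gt;
            when fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;) then &amp;#039;Brec&amp;#039;&lt;br /&gt;
            when fa_data_source = &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
          end&lt;br /&gt;
  -- The WHERE clause limits the update to the three misspellings,&lt;br /&gt;
  -- so the CASE never falls through to NULL.&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;, &amp;#039;TikI&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;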
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
Checking for duplicates on the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt; finds none; the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and checking only the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for two rows to be duplicates?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
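&lt;br /&gt;
The Problem section also describes a check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;, but no query is listed for it.  A sketch of that check, modeled on the start and end time query; whether it reproduces the 430 rows exactly has not been confirmed here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: checking against start time only (fa_time_end left off)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;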
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access - Oct 2025&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
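&lt;br /&gt;
A minimal sketch of the cleanup, assuming it is a plain &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; on &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the conversion code itself is not shown on this page):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal id.&lt;br /&gt;
-- The WHERE clause reuses the &amp;quot;Bad data&amp;quot; test so only affected rows are rewritten.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;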
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between the column names and the primary key designations.&lt;br /&gt;
Running the query with the column spelled &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimId&amp;lt;/code&amp;gt; (ending in a lowercase &amp;lt;code&amp;gt;d&amp;lt;/code&amp;gt;) fails with:&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything about the affected rows, but it is adequate.&lt;br /&gt;
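&lt;br /&gt;
As a sketch, and assuming the change is made with a plain &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (an assumption, since the conversion code is not shown here):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace every 0 animal ID number with NULL, with no further checks.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;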
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focal pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed 8c05232.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as an ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
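&lt;br /&gt;
A sketch of this mapping as a query against the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.  The output column name &amp;lt;code&amp;gt;bad_observation&amp;lt;/code&amp;gt; is hypothetical, and the real conversion may implement the rule differently.  Any value not covered by the rule maps to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, so unexpected codes stay visible in the result.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show how each distinct flag value would be converted.&lt;br /&gt;
SELECT ae.ae_bad_observation_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL THEN FALSE&lt;br /&gt;
         -- Values made up only of spaces become FALSE.&lt;br /&gt;
         WHEN btrim(ae.ae_bad_observation_flag) = &amp;#039;&amp;#039; THEN FALSE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
       END AS bad_observation&lt;br /&gt;
     , count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag&lt;br /&gt;
  ORDER BY ae.ae_bad_observation_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;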
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be at most one row per community per year. The duplicates each have a `b_rec_english` value of `ALL` or `F-F (ALL); F-M (ALL); M-M (ALL)`.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 records where clean.aggression_event does not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id records that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is not a separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than `Y` or `N` (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have an ae_recipient_certainty_flag value other than the required `Y` or `N`.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;N&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;NO&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
** &amp;#039;Y%&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
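&lt;br /&gt;
The proposed mapping above, still pending confirmation, could be sketched as a PostgreSQL expression like the following. This is an illustration run against literal sample values, not the conversion code itself. The &amp;lt;code&amp;gt;raw_flag&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;converted_flag&amp;lt;/code&amp;gt; names are hypothetical. Trimming and upper-casing before comparison (as the queries above do) and leaving unlisted values, including &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, unconverted are assumptions.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: raw_flag and converted_flag are hypothetical names.&lt;br /&gt;
SELECT v.raw_flag,&lt;br /&gt;
       CASE&lt;br /&gt;
         -- space (empty after trimming), N and NO become 0&lt;br /&gt;
         WHEN UPPER(BTRIM(v.raw_flag)) IN (&amp;#039;&amp;#039;, &amp;#039;N&amp;#039;, &amp;#039;NO&amp;#039;) THEN &amp;#039;0&amp;#039;&lt;br /&gt;
         -- X becomes 1&lt;br /&gt;
         WHEN UPPER(BTRIM(v.raw_flag)) = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
         -- anything starting with Y becomes 1&lt;br /&gt;
         WHEN UPPER(BTRIM(v.raw_flag)) LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
         -- unlisted values, including NULL, are left NULL for review (an assumption)&lt;br /&gt;
         ELSE NULL&lt;br /&gt;
       END AS converted_flag&lt;br /&gt;
  FROM (VALUES (&amp;#039; &amp;#039;), (&amp;#039;N&amp;#039;), (&amp;#039;NO&amp;#039;), (&amp;#039;X&amp;#039;), (&amp;#039;Y&amp;#039;), (&amp;#039;YES&amp;#039;), (&amp;#039;?&amp;#039;), (NULL)) AS v(raw_flag);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;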
&lt;br /&gt;
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among the AGGRESSION_EVENT flags (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag) there are 2,864 rows that have a value other than the required `X` or NULL for one or more of the flags.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
** `?` to &amp;#039; &amp;#039;&lt;br /&gt;
** `Y%` to `X`&lt;br /&gt;
** `X%` to `X`&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039;&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
&lt;br /&gt;
== * (#75) There are AGGRESSION_EVENT records where ae_extracted_by values are not in the people table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 28,513 AGGRESSION_EVENT records where ae_extracted_by does not match a person in the PEOPLE table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  s.ae_date,&lt;br /&gt;
  s.ae_time,&lt;br /&gt;
  s.ae_fol_b_focal_id,&lt;br /&gt;
  s.ae_b_aggressor_id,&lt;br /&gt;
  s.ae_b_recipient_id,&lt;br /&gt;
  s.ae_extracted_by,&lt;br /&gt;
  s.ae_source,&lt;br /&gt;
  s.ae_full_description,&lt;br /&gt;
  s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;)) AS raw_extractedby,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, raw_extractedby;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#76) There are AGGRESSION_EVENT rows where the Actor/Actee are not in the biography table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,439 combined rows (but see the notes below) where the ae_b_aggressor_id and/or ae_b_recipient_id is not in the BIOGRAPHY table.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* The row count is inflated by the CROSS JOIN LATERAL, which yields the number of combined, pivoted records for both ae_b_aggressor_id and ae_b_recipient_id, not the actual number of offending AGGRESSION_EVENT rows.&lt;br /&gt;
* The queries currently exclude ae_b_recipient_id values that are NULL, which seems a related but separate problem; the number of rows jumps to 10,357 if rows where ae_b_recipient_id is NULL are included.&lt;br /&gt;
* How are we treating ae_b_recipient_id values such as `group`, `males`, `females`, etc.?&lt;br /&gt;
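&lt;br /&gt;
As a possible complement to the first note, the following sketch counts each offending AGGRESSION_EVENT row once, however many of its participants are missing from BIOGRAPHY. It reuses the table and column names and the scoping conditions of the queries on this page. The &amp;lt;code&amp;gt;offending_event_count&amp;lt;/code&amp;gt; alias is hypothetical, and the query has not been checked against the counts quoted above.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: counts events, not pivoted participant records.&lt;br /&gt;
SELECT COUNT(*) AS offending_event_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
  AND (&lt;br /&gt;
    -- aggressor is present but not in BIOGRAPHY&lt;br /&gt;
    (BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
     AND NOT EXISTS (&lt;br /&gt;
       SELECT 1&lt;br /&gt;
       FROM clean.biography b&lt;br /&gt;
       WHERE b.b_animid = BTRIM(ae.ae_b_aggressor_id)&lt;br /&gt;
     ))&lt;br /&gt;
    OR&lt;br /&gt;
    -- recipient is present but not in BIOGRAPHY&lt;br /&gt;
    (BTRIM(ae.ae_b_recipient_id) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
     AND NOT EXISTS (&lt;br /&gt;
       SELECT 1&lt;br /&gt;
       FROM clean.biography b&lt;br /&gt;
       WHERE b.b_animid = BTRIM(ae.ae_b_recipient_id)&lt;br /&gt;
     ))&lt;br /&gt;
  );&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;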
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- 2) Detailed rows against clean.biography&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    s.ae_date,&lt;br /&gt;
    s.ae_time,&lt;br /&gt;
    s.ae_fol_b_focal_id,&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant,&lt;br /&gt;
    s.ae_b_aggressor_id,&lt;br /&gt;
    s.ae_b_recipient_id,&lt;br /&gt;
    s.ae_source,&lt;br /&gt;
    s.ae_full_description,&lt;br /&gt;
    s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, BTRIM(COALESCE(s.ae_b_aggressor_id, &amp;#039;&amp;#039;))),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, BTRIM(COALESCE(s.ae_b_recipient_id, &amp;#039;&amp;#039;)))&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
WHERE v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time, v.role_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- 1) Summary against clean.biography&lt;br /&gt;
WITH role_values AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      &amp;#039;Actor&amp;#039;::text AS role_name,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) AS participant&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
&lt;br /&gt;
  UNION ALL&lt;br /&gt;
&lt;br /&gt;
  SELECT&lt;br /&gt;
      &amp;#039;Actee&amp;#039;::text AS role_name,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;)) AS participant&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    rv.role_name,&lt;br /&gt;
    rv.participant,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM role_values rv&lt;br /&gt;
WHERE rv.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = rv.participant&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY rv.role_name, rv.participant&lt;br /&gt;
ORDER BY rv.role_name, row_count DESC, rv.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#77) There are AGGRESSION_EVENT rows where ae_recipient_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,102 records where ae_recipient_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_recipient_behavior IS NULL&lt;br /&gt;
  OR ae_recipient_behavior = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#78) There are AGGRESSION_EVENT rows where ae_full_description is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,718 records where ae_full_description is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_full_description IS NULL&lt;br /&gt;
  OR ae_full_description = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=589</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=589"/>
		<updated>2026-05-19T19:13:23Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* Solution */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which help eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
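&lt;br /&gt;
As a rough sketch only (the conversion program&amp;#039;s actual code is not shown on this page), and assuming BREC_time is stored as a timestamp, the time of day can be separated from the unexpected date with a cast:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustrative only; the output column names are made up for the example.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;::DATE as unexpected_date&lt;br /&gt;
     , &amp;quot;BREC_time&amp;quot;::TIME as time_of_day&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;&lt;br /&gt;
  where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;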
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community that is not their birth community before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/1/2024.  Changed the end date of the KK_P1 membership to 8/14/2022.&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4.&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
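&lt;br /&gt;
A minimal sketch of the substitution, assuming it can be expressed as a plain COALESCE over the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  The conversion program&amp;#039;s actual code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustrative only.  Rows with a NULL made_by are attributed to the UNK person.&lt;br /&gt;
select date_of_update&lt;br /&gt;
     , chimp_id&lt;br /&gt;
     , coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;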
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
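&lt;br /&gt;
Taken together with problem #31, the cleanup amounts to something like the following sketch.  It is illustrative only, not the conversion program&amp;#039;s actual code, and the output column names are made up for the example.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Lists each animid that the cleanup would change, next to its cleaned form.&lt;br /&gt;
select distinct fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by original_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;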
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that were done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
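&lt;br /&gt;
As a rough illustration only, the row-creation rule described above might be sketched as follows.  The &amp;quot;AM&amp;quot; and &amp;quot;PM&amp;quot; period codes and the output column aliases are assumptions made for this sketch; this is not the actual conversion code.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: one candidate FOLLOW_OBSERVERS row per follow and time period.&lt;br /&gt;
-- A period is skipped when both of its observer columns are NULL or&lt;br /&gt;
-- empty after trimming.  The &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; codes are assumed values.&lt;br /&gt;
select follow.fol_b_animid&lt;br /&gt;
     , follow.fol_date&lt;br /&gt;
     , &amp;#039;AM&amp;#039; as period&lt;br /&gt;
     , btrim(follow.fol_am_observer_1) as obs_brec&lt;br /&gt;
     , btrim(follow.fol_am_observer_2) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(follow.fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(follow.fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
union all&lt;br /&gt;
select follow.fol_b_animid&lt;br /&gt;
     , follow.fol_date&lt;br /&gt;
     , &amp;#039;PM&amp;#039; as period&lt;br /&gt;
     , btrim(follow.fol_pm_observer_1) as obs_brec&lt;br /&gt;
     , btrim(follow.fol_pm_observer_2) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(follow.fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(follow.fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;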
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
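&lt;br /&gt;
A minimal sketch of the substitution, assuming it is applied to the trimmed observer columns of &amp;lt;code&amp;gt;clean.follow&amp;lt;/code&amp;gt;.  The output column aliases are hypothetical.  Note that this sketch, taken alone, would also fill in periods where both observers are missing; per problem #36 those periods produce no row.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: substitute &amp;#039;NONE&amp;#039; for a missing observer.&lt;br /&gt;
-- nullif() turns an empty (trimmed) string into NULL, so that&lt;br /&gt;
-- coalesce() can supply the placeholder.&lt;br /&gt;
select follow.fol_b_animid&lt;br /&gt;
     , follow.fol_date&lt;br /&gt;
     , coalesce(nullif(btrim(follow.fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as am_brec&lt;br /&gt;
     , coalesce(nullif(btrim(follow.fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as am_tiki&lt;br /&gt;
     , coalesce(nullif(btrim(follow.fol_pm_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as pm_brec&lt;br /&gt;
     , coalesce(nullif(btrim(follow.fol_pm_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as pm_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;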
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
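&lt;br /&gt;
The query above also joins on a case-insensitive match, so it reports only those slash-containing values that have a differently-cased twin.  A more direct listing, offered as a sketch and not as the query that produced the counts above, might be the following.  The &amp;lt;code&amp;gt;observers&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;n_uses&amp;lt;/code&amp;gt; names are made up for the example.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: observer values containing a &amp;quot;/&amp;quot; character, with usage counts.&lt;br /&gt;
-- NULL observers drop out, because strpos(NULL, ...) is NULL.&lt;br /&gt;
select observers.person, count(*) as n_uses&lt;br /&gt;
  from (select btrim(fol_am_observer_1) as person from clean.follow&lt;br /&gt;
        union all&lt;br /&gt;
        select btrim(fol_am_observer_2) from clean.follow&lt;br /&gt;
        union all&lt;br /&gt;
        select btrim(fol_pm_observer_1) from clean.follow&lt;br /&gt;
        union all&lt;br /&gt;
        select btrim(fol_pm_observer_2) from clean.follow) as observers&lt;br /&gt;
  where strpos(observers.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
  group by observers.person&lt;br /&gt;
  order by observers.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;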
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names, and excluding the duplicates, complicates the conversion process.  So resolution of this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot checks have Brec notes but no physical tiki&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
Confirm CT and SG were with CA on 10/23/1980 from brec swahili&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
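&lt;br /&gt;
To make the endpoint arithmetic concrete, here is a small worked example.  The follow date of 2000-06-15 is made up for the illustration, as is the &amp;lt;code&amp;gt;latest_flagged_birthdate&amp;lt;/code&amp;gt; alias.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Worked example: the cutoff birthdate for a (made-up) follow date.&lt;br /&gt;
select &amp;#039;2000-06-15&amp;#039;::date&lt;br /&gt;
         - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
         - &amp;#039;14 years&amp;#039;::INTERVAL as latest_flagged_birthdate;&lt;br /&gt;
-- The result is 1985-06-15 00:00:00.  A female is flagged only when&lt;br /&gt;
-- born on or before that date, that is, when she is at least 15 years&lt;br /&gt;
-- old on the day of the follow.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;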
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
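&lt;br /&gt;
A minimal sketch of these case corrections, assuming they are applied directly to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;.  The actual cleanup may be done differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: normalize the case variants named above.&lt;br /&gt;
-- The where clause limits the update to the 3 known bad spellings.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = case fa_data_source&lt;br /&gt;
                         when &amp;#039;BREC&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         when &amp;#039;brec&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         when &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
                       end&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;, &amp;#039;TikI&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;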
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row.  What does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
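&lt;br /&gt;
The check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; is described in the problem statement but not shown.  A sketch of it, modeled on the query above, follows.  It has not been verified against the 430 row figure.  As with the query above, rows with a NULL time would not be matched by the equality tests.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (sketch)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;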
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRE: still need to address: UWE, GGl, FAC, FAR, GU, and Sl (as of the April 2026 dump)&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access (Oct 2025).&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
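&lt;br /&gt;
A minimal sketch of that cleanup, assuming the rows to fix live in &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the query above does not name a schema) and that trimming does not produce duplicate keys:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: the schema name is an assumption.&lt;br /&gt;
-- Touch only the rows whose focal id actually ends in spaces.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;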
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
However, these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
However, these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
&lt;br /&gt;
The MS Access database likely allows this condition because the character case of the column names differs from that of the primary key designations.&lt;br /&gt;
&lt;br /&gt;
Spelling the column &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimId&amp;lt;/code&amp;gt; (lowercase &amp;lt;code&amp;gt;d&amp;lt;/code&amp;gt;) in the query produces this error:&lt;br /&gt;
&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be either greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This solution is adequate but brute-force, because it does not validate anything about the affected rows.&lt;br /&gt;
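&lt;br /&gt;
The change can be sketched as a single statement, assuming it is applied directly to &amp;lt;code&amp;gt;clean.biography&amp;lt;/code&amp;gt; rather than at some other step of the conversion:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replaces every 0 with NULL, without checking the rows.&lt;br /&gt;
UPDATE clean.biography&lt;br /&gt;
  SET b_animid_num = NULL&lt;br /&gt;
  WHERE b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;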
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focal pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 9c114ac.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Addressed in 8c05232.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#65) There are AGGRESSION_EVENT rows that have &amp;quot;YES&amp;quot; or a space as an ae_bad_observation_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 3,079 rows where the ae_bad_observation_flag is a space, and 1 row where it is &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_bad_observation_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_bad_observation_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;YES&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
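&lt;br /&gt;
One way to express this mapping is a &amp;lt;code&amp;gt;CASE&amp;lt;/code&amp;gt; expression. The following is a sketch, not the conversion code itself; the output column name &amp;lt;code&amp;gt;bad_observation&amp;lt;/code&amp;gt; is hypothetical, and any value not covered by the rule above is left as &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; so that it stands out:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: bad_observation is a hypothetical column name.&lt;br /&gt;
SELECT ae.ae_bad_observation_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         -- NULL and all-space values count as FALSE&lt;br /&gt;
         WHEN ae.ae_bad_observation_flag IS NULL THEN FALSE&lt;br /&gt;
         WHEN btrim(ae.ae_bad_observation_flag) = &amp;#039;&amp;#039; THEN FALSE&lt;br /&gt;
         -- X and YES count as TRUE&lt;br /&gt;
         WHEN btrim(ae.ae_bad_observation_flag) IN (&amp;#039;X&amp;#039;, &amp;#039;YES&amp;#039;) THEN TRUE&lt;br /&gt;
         -- no ELSE: anything unexpected yields NULL&lt;br /&gt;
       END AS bad_observation&lt;br /&gt;
  FROM easy.aggression_event AS ae;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;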
&lt;br /&gt;
== * (#66) There are AGGRESSION_EVENT rows that have a space as a ae_decided_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,041 rows where the ae_decided_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_decided_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_decided_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#67) There are AGGRESSION_EVENT rows that have a space or an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; as a ae_multiple_aggressor_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,397 rows where the ae_multiple_aggressor_flag is a space and 2 rows where the value is an &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_aggressor_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_aggressor_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; (along with &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;TRUE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#68) There are AGGRESSION_EVENT rows that have a space as a ae_multiple_recipient_flag value ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2,528 rows where the ae_multiple_recipient_flag is a space.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT COALESCE(ae.ae_multiple_recipient_flag, &amp;#039;NULL&amp;#039;), count(*)&lt;br /&gt;
  FROM easy.aggression_event AS ae&lt;br /&gt;
  GROUP BY ae.ae_multiple_recipient_flag;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Treat spaces (along with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;) as &amp;lt;code&amp;gt;FALSE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== * (#69) There are duplicate year*community records in AGGRESSION_EVENT_LOG ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 11 duplicate pairs of year*community records in AGGRESSION_EVENT_LOG. There can be, at most, one row per community per year. The duplicates each have a &amp;lt;code&amp;gt;b_rec_english&amp;lt;/code&amp;gt; value of &amp;lt;code&amp;gt;ALL&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;F-F (ALL); F-M (ALL); M-M (ALL)&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event_log&lt;br /&gt;
JOIN (&lt;br /&gt;
  SELECT&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  FROM&lt;br /&gt;
    clean.aggression_event_log&lt;br /&gt;
  GROUP BY&lt;br /&gt;
    year,&lt;br /&gt;
    community&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ) AS dups ON (&lt;br /&gt;
  dups.year          = aggression_event_log.year&lt;br /&gt;
  AND dups.community = aggression_event_log.community&lt;br /&gt;
)&lt;br /&gt;
order by &lt;br /&gt;
  aggression_event_log.year,&lt;br /&gt;
  aggression_event_log.community ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#70) There are AGGRESSION_EVENT rows that do not have a matching FOLLOW for the same focal ID and date. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 1,494 clean.aggression_event rows that do not have a matching row in clean.follow for the same focal ID and date.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH problem_rows AS (&lt;br /&gt;
    SELECT&lt;br /&gt;
        ae.*,&lt;br /&gt;
        EXISTS (&lt;br /&gt;
            SELECT 1&lt;br /&gt;
            FROM clean.follow f_id&lt;br /&gt;
            WHERE f_id.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
        ) AS focal_id_exists_in_follow&lt;br /&gt;
    FROM clean.aggression_event ae&lt;br /&gt;
    WHERE NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.follow f&lt;br /&gt;
        WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
          AND f.fol_date = ae.ae_date&lt;br /&gt;
    )&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_time,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_b_aggressor_id,&lt;br /&gt;
    pr.ae_b_recipient_id,&lt;br /&gt;
    pr.ae_source,&lt;br /&gt;
    pr.ae_full_description,&lt;br /&gt;
    CASE&lt;br /&gt;
        WHEN pr.focal_id_exists_in_follow THEN &amp;#039;missing_follow_on_same_date&amp;#039;&lt;br /&gt;
        ELSE &amp;#039;focal_id_not_found_in_follow&amp;#039;&lt;br /&gt;
    END AS issue_type&lt;br /&gt;
FROM problem_rows pr&lt;br /&gt;
ORDER BY&lt;br /&gt;
    pr.ae_date,&lt;br /&gt;
    pr.ae_fol_b_focal_id,&lt;br /&gt;
    pr.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#71) There are AGGRESSION_EVENT animal ids that are not reflected among FOLLOW animal ids. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 16 unique(!) aggression_event.ae_fol_b_focal_id records that are not reflected among follow.fol_b_animid values.&lt;br /&gt;
&lt;br /&gt;
=== Processing note ===&lt;br /&gt;
All of the problematic records are reflected in the rows excluded as part of Problem #70. As such, there is not a separate exclusion clause in the conversion process for these data. As a corollary, addressing all issues in Problem #70 would address all Problem #71 infractions as well.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae_fol_b_focal_id,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE ae_fol_b_focal_id NOT IN (&lt;br /&gt;
    SELECT DISTINCT fol_b_animid&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE fol_b_animid IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
GROUP BY ae_fol_b_focal_id&lt;br /&gt;
ORDER BY&lt;br /&gt;
  row_count DESC,&lt;br /&gt;
  ae_fol_b_focal_id&lt;br /&gt;
;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
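== * (#72) There are AGGRESSION_EVENT recipient certainty flags other than &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; (required). ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 28,322 AGGRESSION_EVENT rows that have an ae_recipient_certainty_flag value other than the required &amp;lt;code&amp;gt;Y&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt;.&lt;br /&gt;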
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
ae.*,&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;) ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_flag,&lt;br /&gt;
COUNT(*) AS row_count&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM clean.follow f&lt;br /&gt;
  WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
  AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND UPPER(BTRIM(COALESCE(ae.ae_recipient_certainty_flag, &amp;#039;&amp;#039;))) NOT IN (&amp;#039;N&amp;#039;, &amp;#039;Y&amp;#039;)&lt;br /&gt;
GROUP BY COALESCE(ae_recipient_certainty_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;)&lt;br /&gt;
ORDER BY row_count DESC, raw_flag;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039; &amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;N&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;NO&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
** &amp;#039;Y%&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
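&lt;br /&gt;
Pending that confirmation, the proposed conversions could be sketched as follows. The output column name &amp;lt;code&amp;gt;converted_flag&amp;lt;/code&amp;gt; is hypothetical, and because the list above does not say how &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; or other values are handled, the sketch leaves them as &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: converted_flag is a hypothetical column name.&lt;br /&gt;
SELECT ae.ae_recipient_certainty_flag&lt;br /&gt;
     , CASE&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag = &amp;#039; &amp;#039; THEN &amp;#039;0&amp;#039;&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag IN (&amp;#039;N&amp;#039;, &amp;#039;NO&amp;#039;) THEN &amp;#039;0&amp;#039;&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag = &amp;#039;X&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
         WHEN ae.ae_recipient_certainty_flag LIKE &amp;#039;Y%&amp;#039; THEN &amp;#039;1&amp;#039;&lt;br /&gt;
         -- no ELSE: NULL and unlisted values yield NULL&lt;br /&gt;
       END AS converted_flag&lt;br /&gt;
  FROM clean.aggression_event AS ae;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;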
&lt;br /&gt;
&lt;br /&gt;
== * (#73) There are AGGRESSION_EVENT aggression_event.ae_b_recipient_id values that are NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5,820 AGGRESSION_EVENT rows for which the aggression_event.ae_b_recipient_id is NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    ae.ae_date,&lt;br /&gt;
    ae.ae_time,&lt;br /&gt;
    ae.ae_fol_b_focal_id,&lt;br /&gt;
    ae.ae_b_aggressor_id,&lt;br /&gt;
    ae.ae_b_recipient_id,&lt;br /&gt;
    ae.ae_recipient_certainty_flag,&lt;br /&gt;
    ae.ae_full_description,&lt;br /&gt;
    ae.ae_source,&lt;br /&gt;
    ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NULL&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#74) There are AGGRESSION_EVENT event flags (multiple columns) that have values other than X or NULL. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
Among the AGGRESSION_EVENT flags (ae_bad_observation_flag, ae_bristle_flag, ae_chase_flag, ae_contact_flag, ae_decided_flag, ae_display_flag, ae_multiple_aggressor_flag, ae_multiple_recipient_flag, ae_vocal_flag), there are 2,864 rows that have a value other than the required &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt; or NULL for one or more of the flags.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
  ae.ae_date,&lt;br /&gt;
  ae.ae_time,&lt;br /&gt;
  ae.ae_fol_b_focal_id,&lt;br /&gt;
  ae.ae_b_aggressor_id,&lt;br /&gt;
  ae.ae_b_recipient_id,&lt;br /&gt;
  ae.ae_decided_flag,&lt;br /&gt;
  ae.ae_multiple_aggressor_flag,&lt;br /&gt;
  ae.ae_multiple_recipient_flag,&lt;br /&gt;
  ae.ae_bad_observation_flag,&lt;br /&gt;
  ae.ae_bristle_flag,&lt;br /&gt;
  ae.ae_display_flag,&lt;br /&gt;
  ae.ae_chase_flag,&lt;br /&gt;
  ae.ae_contact_flag,&lt;br /&gt;
  ae.ae_vocal_flag,&lt;br /&gt;
  ae.ae_source,&lt;br /&gt;
  ae.ae_comments&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
)&lt;br /&gt;
AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
AND (&lt;br /&gt;
    (ae.ae_decided_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_decided_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_aggressor_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_aggressor_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_multiple_recipient_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_multiple_recipient_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bad_observation_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bad_observation_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_bristle_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_bristle_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_display_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_display_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_chase_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_chase_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_contact_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_contact_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
 OR (ae.ae_vocal_flag IS NOT NULL AND UPPER(BTRIM(ae.ae_vocal_flag)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
ORDER BY ae.ae_date, ae.ae_fol_b_focal_id, ae.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH base AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
),&lt;br /&gt;
flag_values AS (&lt;br /&gt;
  SELECT &amp;#039;ae_decided_flag&amp;#039; AS flag_name, COALESCE(ae_decided_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) AS raw_value FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_aggressor_flag&amp;#039;, COALESCE(ae_multiple_aggressor_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_multiple_recipient_flag&amp;#039;, COALESCE(ae_multiple_recipient_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bad_observation_flag&amp;#039;, COALESCE(ae_bad_observation_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_bristle_flag&amp;#039;, COALESCE(ae_bristle_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_display_flag&amp;#039;, COALESCE(ae_display_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_chase_flag&amp;#039;, COALESCE(ae_chase_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_contact_flag&amp;#039;, COALESCE(ae_contact_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT &amp;#039;ae_vocal_flag&amp;#039;, COALESCE(ae_vocal_flag, &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;) FROM base&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  flag_name,&lt;br /&gt;
  raw_value,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM flag_values&lt;br /&gt;
WHERE raw_value &amp;lt;&amp;gt; &amp;#039;&amp;lt;NULL&amp;gt;&amp;#039;&lt;br /&gt;
  AND UPPER(BTRIM(raw_value)) &amp;lt;&amp;gt; &amp;#039;X&amp;#039;&lt;br /&gt;
GROUP BY flag_name, raw_value&lt;br /&gt;
ORDER BY flag_name, row_count DESC, raw_value;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ian or Elizabeth to please confirm.&lt;br /&gt;
&lt;br /&gt;
* clean schema conversions (mimic patterns in Access):&lt;br /&gt;
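** &amp;lt;code&amp;gt;?&amp;lt;/code&amp;gt; to &amp;#039; &amp;#039;&lt;br /&gt;
** &amp;lt;code&amp;gt;Y%&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;&lt;br /&gt;
** &amp;lt;code&amp;gt;X%&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;X&amp;lt;/code&amp;gt;&lt;br /&gt;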
&lt;br /&gt;
* sokwe conversions:&lt;br /&gt;
** &amp;#039;&amp;#039; to &amp;#039;0&amp;#039;&lt;br /&gt;
** &amp;#039;X&amp;#039; to &amp;#039;1&amp;#039;&lt;br /&gt;
&lt;br /&gt;
== * (#75) There are AGGRESSION_EVENT records where ae_extracted_by values are not in the people table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 28,513 AGGRESSION_EVENT records where ae_extracted_by does not match a person in the PEOPLE table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  s.ae_date,&lt;br /&gt;
  s.ae_time,&lt;br /&gt;
  s.ae_fol_b_focal_id,&lt;br /&gt;
  s.ae_b_aggressor_id,&lt;br /&gt;
  s.ae_b_recipient_id,&lt;br /&gt;
  s.ae_extracted_by,&lt;br /&gt;
  s.ae_source,&lt;br /&gt;
  s.ae_full_description,&lt;br /&gt;
  s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
  BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;)) AS raw_extractedby,&lt;br /&gt;
  COUNT(*) AS row_count&lt;br /&gt;
FROM scoped s&lt;br /&gt;
WHERE NOT EXISTS (&lt;br /&gt;
  SELECT 1&lt;br /&gt;
  FROM people p&lt;br /&gt;
  WHERE p.person = BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
     OR LOWER(p.name) = LOWER(BTRIM(COALESCE(s.ae_extracted_by, &amp;#039;&amp;#039;)))&lt;br /&gt;
)&lt;br /&gt;
GROUP BY BTRIM(COALESCE(ae_extracted_by, &amp;#039;&amp;#039;))&lt;br /&gt;
ORDER BY row_count DESC, raw_extractedby;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#76) There are AGGRESSION_EVENT rows where the Actor/Actee are not in the biography table. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,439 combined rows (but see the notes below) where the ae_b_aggressor_id and/or ae_b_recipient_id is not in the BIOGRAPHY table.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
* The row count is inflated by a CROSS LATERAL JOIN, which yields the number of combined, pivoted records for both ae_b_aggressor_id and ae_b_recipient, not the actual number of confounding AGGRESSION_EVENT rows.&lt;br /&gt;
* The queries currently exclude ae_b_recipient_id values that are NULL, which seems a related but separte problem; the number of rows jumps to 10,357 if ae_b_recipient_id = NULL are included.&lt;br /&gt;
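* How are we treating ae_b_recipient_id values such as &amp;lt;code&amp;gt;group&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;males&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;females&amp;lt;/code&amp;gt;, etc.?&lt;br /&gt;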
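&lt;br /&gt;
A minimal sketch of a query that counts each offending AGGRESSION_EVENT row once, rather than once per role.  It assumes the same scoping as the queries in the Bad data section (a matching FOLLOW row and a non-NULL ae_b_recipient_id); the names offending_rows and participant are illustrative only.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Count each AGGRESSION_EVENT row at most once, even when both roles are unmatched.&lt;br /&gt;
SELECT COUNT(*) AS offending_rows&lt;br /&gt;
FROM clean.aggression_event ae&lt;br /&gt;
WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
  AND EXISTS (&lt;br /&gt;
    -- True when at least one of the two participants is missing from BIOGRAPHY.&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM (VALUES&lt;br /&gt;
            (BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;))),&lt;br /&gt;
            (BTRIM(ae.ae_b_recipient_id))&lt;br /&gt;
         ) AS v(participant)&lt;br /&gt;
    WHERE v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
      AND NOT EXISTS (&lt;br /&gt;
        SELECT 1&lt;br /&gt;
        FROM clean.biography b&lt;br /&gt;
        WHERE b.b_animid = v.participant&lt;br /&gt;
      )&lt;br /&gt;
  );&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;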
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
==== Full record ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- 2) Detailed rows against clean.biography&lt;br /&gt;
WITH scoped AS (&lt;br /&gt;
  SELECT ae.*&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    s.ae_date,&lt;br /&gt;
    s.ae_time,&lt;br /&gt;
    s.ae_fol_b_focal_id,&lt;br /&gt;
    v.role_name,&lt;br /&gt;
    v.participant,&lt;br /&gt;
    s.ae_b_aggressor_id,&lt;br /&gt;
    s.ae_b_recipient_id,&lt;br /&gt;
    s.ae_source,&lt;br /&gt;
    s.ae_full_description,&lt;br /&gt;
    s.ae_comments&lt;br /&gt;
FROM scoped s&lt;br /&gt;
CROSS JOIN LATERAL (&lt;br /&gt;
  VALUES&lt;br /&gt;
    (&amp;#039;Actor&amp;#039;::text, BTRIM(COALESCE(s.ae_b_aggressor_id, &amp;#039;&amp;#039;))),&lt;br /&gt;
    (&amp;#039;Actee&amp;#039;::text, BTRIM(COALESCE(s.ae_b_recipient_id, &amp;#039;&amp;#039;)))&lt;br /&gt;
) AS v(role_name, participant)&lt;br /&gt;
WHERE v.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = v.participant&lt;br /&gt;
  )&lt;br /&gt;
ORDER BY s.ae_date, s.ae_fol_b_focal_id, s.ae_time, v.role_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Summary of non-compliant values ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- 1) Summary against clean.biography&lt;br /&gt;
WITH role_values AS (&lt;br /&gt;
  SELECT&lt;br /&gt;
      &amp;#039;Actor&amp;#039;::text AS role_name,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_aggressor_id, &amp;#039;&amp;#039;)) AS participant&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
&lt;br /&gt;
  UNION ALL&lt;br /&gt;
&lt;br /&gt;
  SELECT&lt;br /&gt;
      &amp;#039;Actee&amp;#039;::text AS role_name,&lt;br /&gt;
      BTRIM(COALESCE(ae.ae_b_recipient_id, &amp;#039;&amp;#039;)) AS participant&lt;br /&gt;
  FROM clean.aggression_event ae&lt;br /&gt;
  WHERE EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.follow f&lt;br /&gt;
    WHERE f.fol_b_animid = ae.ae_fol_b_focal_id&lt;br /&gt;
      AND f.fol_date = ae.ae_date&lt;br /&gt;
  )&lt;br /&gt;
  AND ae.ae_b_recipient_id IS NOT NULL&lt;br /&gt;
)&lt;br /&gt;
SELECT&lt;br /&gt;
    rv.role_name,&lt;br /&gt;
    rv.participant,&lt;br /&gt;
    COUNT(*) AS row_count&lt;br /&gt;
FROM role_values rv&lt;br /&gt;
WHERE rv.participant &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  AND NOT EXISTS (&lt;br /&gt;
    SELECT 1&lt;br /&gt;
    FROM clean.biography b&lt;br /&gt;
    WHERE b.b_animid = rv.participant&lt;br /&gt;
  )&lt;br /&gt;
GROUP BY rv.role_name, rv.participant&lt;br /&gt;
ORDER BY rv.role_name, row_count DESC, rv.participant;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#77) There are AGGRESSION_EVENT rows where ae_recipient_behavior is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9,102 records where ae_recipient_behavior is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_recipient_behavior IS NULL&lt;br /&gt;
  OR ae_recipient_behavior = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#78) There are AGGRESSION_EVENT rows where ae_full_description is NULL or an empty string. ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,718 records where ae_full_description is NULL or an empty string.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
FROM clean.aggression_event&lt;br /&gt;
WHERE&lt;br /&gt;
  ae_full_description IS NULL&lt;br /&gt;
  OR ae_full_description = &amp;#039;&amp;#039; ;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=557</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=557"/>
		<updated>2026-05-07T23:19:42Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* Solution */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which help eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, table does not exist&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; the communities they name do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
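&lt;br /&gt;
A minimal sketch of the substitution, not the conversion program&amp;#039;s actual code: COALESCE maps a NULL made_by to the UNK person code.  It assumes made_by is a text column holding PEOPLE codes; row_count is an illustrative alias.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Show how many log rows each person would own once NULL is mapped to UNK.&lt;br /&gt;
SELECT COALESCE(log.made_by, &amp;#039;UNK&amp;#039;) AS made_by&lt;br /&gt;
     , COUNT(*) AS row_count&lt;br /&gt;
  FROM clean.community_membership_update_log AS log&lt;br /&gt;
  GROUP BY COALESCE(log.made_by, &amp;#039;UNK&amp;#039;)&lt;br /&gt;
  ORDER BY row_count DESC;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;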
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column &lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
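&lt;br /&gt;
The direction of the mismatch can also be checked directly.  The following is a sketch, using the same joins and predicate as the Bad Data query above, that counts the mismatched rows by the follow&amp;#039;s flag and the first arrival&amp;#039;s nesting type.  The resulting counts are not recorded here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: break the start-in-nest mismatches down by direction.&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
     , COUNT(*)&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  GROUP BY follow.fol_flag_begin_in_nest, first_arrivals.fa_type_of_nesting&lt;br /&gt;
  ORDER BY follow.fol_flag_begin_in_nest, first_arrivals.fa_type_of_nesting;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;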
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
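&lt;br /&gt;
Problems #31 and #32 can be handled by one normalizing expression.  The following is a minimal sketch that previews the change, using the same &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; expression that appears in the queries of problem #33.  That the conversion process applies exactly this expression is an assumption.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: show each animid that the cleanup would change.&lt;br /&gt;
select fol_b_animid as old_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as new_animid&lt;br /&gt;
     , count(*)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  group by fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;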
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
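&lt;br /&gt;
The process described above can be sketched as a query.  This is an illustration, not the actual conversion code.  The &amp;lt;code&amp;gt;AM&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;PM&amp;lt;/code&amp;gt; period codes and the output column names are placeholders.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: one row per follow and time period, skipping periods&lt;br /&gt;
-- where both observer columns are NULL or empty after trimming.&lt;br /&gt;
-- The period codes are placeholders, not actual PERIODS values.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , &amp;#039;AM&amp;#039; as period&lt;br /&gt;
     , btrim(fol_am_observer_1) as obs_brec&lt;br /&gt;
     , btrim(fol_am_observer_2) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
union all&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , &amp;#039;PM&amp;#039; as period&lt;br /&gt;
     , btrim(fol_pm_observer_1) as obs_brec&lt;br /&gt;
     , btrim(fol_pm_observer_2) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;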
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
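&lt;br /&gt;
No query was recorded for this problem.  The following is a sketch of one that should find the affected follows, on the assumption that &amp;quot;only one observer&amp;quot; means that exactly one of the two observer columns of a time period is NULL or empty after trimming.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: follows where a time period has exactly one observer.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , fol_am_observer_1, fol_am_observer_2&lt;br /&gt;
     , fol_pm_observer_1, fol_pm_observer_2&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where ((coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
         &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;))&lt;br /&gt;
        or ((coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
            &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;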
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these variants and excluding duplicates in the conversion process is complicated, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
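&lt;br /&gt;
A minimal sketch of the substitution, shown as a summary of the resulting cycle codes.  How the conversion actually applies the &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; value is an assumption.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: count arrivals by cycle code, with NULL shown as MISS.&lt;br /&gt;
select coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state&lt;br /&gt;
     , count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  group by 1&lt;br /&gt;
  order by 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;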
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot checks have Brec notes but no physical tiki&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
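&lt;br /&gt;
To see how the count of problem rows depends on the chosen age limits, the arrivals of females with a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; can be tallied by age in whole years.  This is a sketch, under the assumption that fa_fol_date and b_birthdate are dates or timestamps, as the interval arithmetic in the query below suggests.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: count U-coded female arrivals by age in whole years.&lt;br /&gt;
select date_part(&amp;#039;year&amp;#039;&lt;br /&gt;
                , age(follow_arrival.fa_fol_date::timestamp&lt;br /&gt;
                    , biography.b_birthdate::timestamp)) as age_in_years&lt;br /&gt;
     , count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
  group by 1&lt;br /&gt;
  order by 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Ages of 15 and above should correspond to the rows of this problem, and ages under 5 to those of problem #47.&lt;br /&gt;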
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
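&lt;br /&gt;
The effect of subtracting the two intervals can be checked on literal dates.  This is a small self-contained illustration, not part of the conversion; the dates and the names &amp;lt;code&amp;gt;t&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;birthdate&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;too_old&amp;lt;/code&amp;gt; are invented for the example.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Same comparison as above, applied to two made-up birthdates&lt;br /&gt;
select t.fol_date&lt;br /&gt;
     , t.birthdate&lt;br /&gt;
     , t.birthdate&lt;br /&gt;
         &amp;lt;= (t.fol_date&lt;br /&gt;
            - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
            - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
            ) as too_old&lt;br /&gt;
  from (values (&amp;#039;2020-06-15&amp;#039;::DATE, &amp;#039;2005-06-15&amp;#039;::DATE)&lt;br /&gt;
             , (&amp;#039;2020-06-15&amp;#039;::DATE, &amp;#039;2005-06-16&amp;#039;::DATE)&lt;br /&gt;
       ) as t (fol_date, birthdate);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The first row, an individual exactly 15 years old on the follow date, is flagged as too old; the second, one day short of 15, is not.&lt;br /&gt;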
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit and set a soft limit of 14 years in the warning system, with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
5/7/2026: ICG checked through 2003. When there was a &amp;quot;?&amp;quot; for swelling, I entered &amp;#039;0&amp;#039;. If there was actually a &amp;#039;U&amp;#039; on the tiki itself, I left &amp;quot;U&amp;quot; in the FA table. Note that a lot of these females are MGF, so the U might be legit.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
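&lt;br /&gt;
A recode of this kind could look like the following.  This is a sketch only, not necessarily the statement the conversion actually runs; it assumes that &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; is already an accepted cycle code.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: recode n/a to MISS for female arriving individuals&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;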
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
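&lt;br /&gt;
The case corrections above could be applied with a statement like the following.  This is a sketch, not necessarily what the conversion runs; it covers only the three misspellings named in this section.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: normalize the letter case of the misspelled codes&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = case fa_data_source&lt;br /&gt;
                         when &amp;#039;BREC&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         when &amp;#039;brec&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         when &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
                       end&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;, &amp;#039;TikI&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;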
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and checking just the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for two rows to be duplicates?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
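&lt;br /&gt;
No query is given above for the check on start time alone.  The following sketch adapts the preceding query by dropping &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; from the duplicate key; it has not been confirmed to be the query that produced the count of 430.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (a sketch)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;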
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
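&lt;br /&gt;
Before deleting anything, it may help to know whether the rows sharing a key are exact copies or differ in some other column.  The following sketch compares whole rows by casting each to text; the alias &amp;lt;code&amp;gt;gb&amp;lt;/code&amp;gt; and the output names &amp;lt;code&amp;gt;n_rows&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;n_distinct_rows&amp;lt;/code&amp;gt; are made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: per duplicated key, count the rows and the distinct whole rows&lt;br /&gt;
SELECT &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
     , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
     , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
     , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
     , count(*) AS n_rows&lt;br /&gt;
     , count(DISTINCT gb::TEXT) AS n_distinct_rows&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot; AS gb&lt;br /&gt;
  GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
         , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
         , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
         , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ORDER BY n_distinct_rows DESC, &amp;quot;GRM_FOL_date&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Keys where &amp;lt;code&amp;gt;n_distinct_rows&amp;lt;/code&amp;gt; is 1 hold exact copies; keys where it is larger have rows that differ in some non-key column and probably need to be checked by hand.&lt;br /&gt;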
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between the column names and the primary key designations.&lt;br /&gt;
Quoted column names are case-sensitive.  Written with &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimId&amp;lt;/code&amp;gt; (lower-case &amp;lt;code&amp;gt;d&amp;lt;/code&amp;gt;), the query fails:&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be either greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything concerning the rows affected, but it is adequate.&lt;br /&gt;
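&lt;br /&gt;
The change amounts to a single statement, sketched here against the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema used by the query above; whether the conversion applies it in exactly this form is an assumption.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: replace every 0 animal ID number with NULL&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;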
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (It is not in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
===SOLUTION===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
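&lt;br /&gt;
Since many arrival rows can share one missing follow, a list of the distinct date/focal pairs may be a shorter work list.  The following sketch groups the result of the query above; the output name &amp;lt;code&amp;gt;n_arrivals&amp;lt;/code&amp;gt; is made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: the follow date/focal pairs that have no follow row&lt;br /&gt;
SELECT follow_arrival.fa_fol_date&lt;br /&gt;
     , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
     , count(*) AS n_arrivals&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  GROUP BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;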
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow_arrival row where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=556</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=556"/>
		<updated>2026-05-06T21:48:10Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* Solution */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
        SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
        TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
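&lt;br /&gt;
One way the conversion could discard the stray date portion, sketched here under the assumption that BREC_time is stored as a timestamp in the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema.  The &amp;lt;code&amp;gt;brec_time&amp;lt;/code&amp;gt; alias is hypothetical, and the actual conversion code may differ:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustrative only: keep the time of day and drop the date portion.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;::TIME as brec_time&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;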
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access, 2/1/2024, ICG.&lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022.&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
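&lt;br /&gt;
A minimal sketch of the substitution, assuming the unknown person has the code &amp;#039;UNK&amp;#039; and that a simple COALESCE is sufficient; the actual conversion code may differ:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustrative only: report the UNK person wherever made_by is NULL.&lt;br /&gt;
select coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;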
&lt;br /&gt;
Fixed in commit 7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error; ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error; ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore. Default to data from Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
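&lt;br /&gt;
Combined with the trailing-space removal of problem #31, the cleanup can be sketched as a single expression.  This is an illustration of the idea against the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, not the actual conversion code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustrative only: strip trailing spaces, then force upper-case.&lt;br /&gt;
select fol_date&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as fol_b_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;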
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that were done before their focal was under study, that is, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
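&lt;br /&gt;
A query along the following lines should list the affected follows.  This is a sketch, not the query used in the conversion: it assumes that &amp;quot;no observer&amp;quot; means NULL or the empty string after trimming, as in problem #37, and it has not been run against the data.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Follows where exactly one of the two observers is recorded,&lt;br /&gt;
-- in either the AM or the PM period.  (Sketch; not verified.)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
             &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;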
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolving this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
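&lt;br /&gt;
To see which data sources the unmatched rows come from, the rows can be summarized by &amp;lt;code&amp;gt;fa_data_source&amp;lt;/code&amp;gt;.  This is a sketch that has not been run against the data; it assumes the &amp;lt;code&amp;gt;fa_data_source&amp;lt;/code&amp;gt; column (see problem #51) is present in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Count the follow arrivals with no related follow, by data source.&lt;br /&gt;
-- (Sketch; not verified.)&lt;br /&gt;
SELECT follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  GROUP BY follow_arrival.fa_data_source&lt;br /&gt;
  ORDER BY follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;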
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
FIXED MANY ROWS IN MS ACCESS&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot checks have Brec notes but no physical tiki&lt;br /&gt;
Need to investigate further - brec swahili?&lt;br /&gt;
&lt;br /&gt;
For now, allow these rows&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOW UP WITH KARL.&lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;, or&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
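&lt;br /&gt;
No query is recorded here for the start-time-only check mentioned in the problem description.  A sketch of it, assuming it is simply the query above with &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; dropped (not verified to reproduce the 430 rows):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only.  (Sketch; not verified.)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;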
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
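&lt;br /&gt;
A minimal sketch of that cleanup, assuming it is run directly against &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the schema name is an assumption taken from the solution text; the query above does not name one) and not as part of the conversion scripts, which are not shown here:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal animid.&lt;br /&gt;
-- The WHERE clause limits the update to the rows the query above reports.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Re-running the query in the Bad data section afterwards should return no rows.&lt;br /&gt;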
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between column names and primary key designations.&lt;br /&gt;
With the column spelled &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimId&amp;lt;/code&amp;gt; (lowercase final d), the query fails:&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, since it does not validate anything about the affected rows, but it is adequate.&lt;br /&gt;
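&lt;br /&gt;
A minimal sketch of that change, assuming it is applied directly to &amp;lt;code&amp;gt;clean.biography&amp;lt;/code&amp;gt; (the table queried above) and not inside the conversion scripts, which are not shown here:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace the placeholder 0 with NULL.&lt;br /&gt;
-- No other column is checked, which is why this is brute force.&lt;br /&gt;
UPDATE clean.biography&lt;br /&gt;
  SET b_animid_num = NULL&lt;br /&gt;
  WHERE b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;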
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
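&lt;br /&gt;
A minimal sketch of that cleanup for the arriving animid, assuming the underlying table is &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the query above reads from the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, so the table name here is an assumption based on the solution text):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the arriving animid.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_b_arr_animid = rtrim(fa_b_arr_animid)&lt;br /&gt;
  WHERE fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;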
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=555</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=555"/>
		<updated>2026-05-05T23:09:14Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* Solution */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
        SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
        TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column contains non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
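&lt;br /&gt;
A minimal sketch of the substitution, assuming it can be expressed as a query over the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and that &amp;lt;code&amp;gt;made_by&amp;lt;/code&amp;gt; is a text column (the committed conversion code may do this differently).  The table and column names come from the query above; the output column name &amp;lt;code&amp;gt;made_by_or_unk&amp;lt;/code&amp;gt; is made up for illustration:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace a NULL made_by with the UNK person code.&lt;br /&gt;
select log.*&lt;br /&gt;
     , coalesce(log.made_by, &amp;#039;UNK&amp;#039;) as made_by_or_unk&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  order by made_by_or_unk;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;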
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
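&lt;br /&gt;
One way to read this is to derive the begin-in-nest flag from the focal&amp;#039;s first arrival.  The following is a sketch, not the conversion&amp;#039;s actual code.  It assumes, as the Bad Data query above does, that &amp;lt;code&amp;gt;fa_type_of_nesting&amp;lt;/code&amp;gt; values 1 and 3 mean the individual started in a nest.  Restricting the join to the focal&amp;#039;s own arrival row is an added assumption, and the column name &amp;lt;code&amp;gt;derived_begin_in_nest&amp;lt;/code&amp;gt; is made up for illustration:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: compute a begin-in-nest flag from follow_arrival.&lt;br /&gt;
-- The flag is NULL when fa_type_of_nesting is NULL.&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.fa_fol_date, spans.fa_fol_b_focal_animid&lt;br /&gt;
     , (first_arrivals.fa_type_of_nesting IN (1, 3))&lt;br /&gt;
         AS derived_begin_in_nest&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = spans.fa_fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
          AND first_arrivals.fa_b_arr_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;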
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
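&lt;br /&gt;
The affected follows can probably be listed with a query along these lines.  This is a sketch modeled on the query in problem #37; it assumes &amp;quot;only one observer&amp;quot; means that, within the AM or the PM pair of columns, exactly one column is NULL or blank:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: a pair has one observer when exactly one of&lt;br /&gt;
-- its two columns is blank, so the two blank tests differ.&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where ((coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
         &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;))&lt;br /&gt;
        or ((coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
            &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;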
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolving this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;732&amp;lt;/s&amp;gt; 276 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
5/5/2026 ICG&lt;br /&gt;
Fixed 194 rows in MS ACCESS.&lt;br /&gt;
     2015-02-28 – Manually added FU follow data into FOLLOW.  66 rows&lt;br /&gt;
     20-Nov-2014 – Manually added WEM follow data into Follow. 30 rows &lt;br /&gt;
     19-Aug-2014 DB follow was mistakenly entered as 18-Aug. 27 rows&lt;br /&gt;
     1-Jul-2014 Focal in FOLLOW was MGF. On tiki, WEM written in. Same follow. 46 rows&lt;br /&gt;
     18-Jun-2014 same issue. 25 rows.&lt;br /&gt;
&lt;br /&gt;
Most of the rest are hand-entered juvenile arrivals from Brec. Spot-checked rows have Brec notes but no physical tiki.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with the community entry date or the biography.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
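&lt;br /&gt;
A sketch of how the change might be made in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. It assumes &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; is already a valid cycle state; the statement actually used during the conversion is not recorded here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: &amp;#039;MISS&amp;#039; is already an allowed fa_type_of_cycle value.&lt;br /&gt;
-- Selects the same rows as the query above: females with a cycle code of n/a.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;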
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;, or&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
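&lt;br /&gt;
The case corrections above could be applied with a single statement along these lines. This is only a sketch, assuming the cleanup is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema; it does not cover adding the new codes to the list of valid sources.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Normalize the mis-cased source codes; all other values are left alone.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = case fa_data_source&lt;br /&gt;
                         when &amp;#039;BREC&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         when &amp;#039;brec&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         when &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
                       end&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;, &amp;#039;TikI&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;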
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
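&lt;br /&gt;
The check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; is described in the problem statement but not shown. A sketch of it, modeled on the query above, might look like the following; whether this is the exact query behind the 430 figure is an assumption.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (fa_time_end left off)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;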
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
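&lt;br /&gt;
A minimal sketch of such a cleanup, assuming the column is a variable-length text type so that the comparison with the trimmed value detects the trailing spaces, as in the query above:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Only rows whose focal id actually changes are updated.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;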
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely due to a difference in character case between column names and primary key designations.&lt;br /&gt;
&lt;br /&gt;
With the column spelled &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimId&amp;lt;/code&amp;gt; (lower-case &amp;lt;code&amp;gt;d&amp;lt;/code&amp;gt;), the query failed with:&lt;br /&gt;
&lt;br /&gt;
 ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
 Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be either greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything concerning the affected rows, but it is adequate.&lt;br /&gt;
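&lt;br /&gt;
As a sketch, assuming the change is made directly in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, the update could be written as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Replace every 0 animal ID number with NULL; no other validation is done.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = null&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;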
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
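&lt;br /&gt;
A minimal sketch of that cleanup, assuming it is applied as a one-off statement against &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the conversion program may handle this differently):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Strip trailing spaces from the arriving animid; only affected rows are updated.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_b_arr_animid = rtrim(fa_b_arr_animid)&lt;br /&gt;
  WHERE fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Running the same comparison against &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; afterwards should then return no rows.&lt;br /&gt;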
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Karl to fix.&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=548</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=548"/>
		<updated>2026-03-25T22:51:50Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#64) There are follow_arrival rows with NULL fa_type_of_certainty values */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When the query is run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
       TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
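&lt;br /&gt;
The substitution can be sketched as follows. This is only an illustration, not the code in the commit: it assumes an &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; row already exists in &amp;lt;code&amp;gt;clean.people&amp;lt;/code&amp;gt; and that the change is applied directly to the log table.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Replace missing MadeBy values with the unknown person code (assumed to be &amp;#039;UNK&amp;#039;).&lt;br /&gt;
UPDATE clean.community_membership_update_log&lt;br /&gt;
  SET made_by = &amp;#039;UNK&amp;#039;&lt;br /&gt;
  WHERE made_by IS NULL;&amp;lt;/pre&amp;gt;&lt;br /&gt;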
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single person code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error; ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error; ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
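&lt;br /&gt;
A minimal sketch of the cleanup, combining this upper-casing with the trailing-space removal from problem #31.  It only previews the old and new values; how the conversion process actually applies the change is not shown on this page, and the column aliases are illustrative.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: preview the normalized animid of each affected follow.&lt;br /&gt;
-- The old_animid and new_animid aliases are made up for illustration.&lt;br /&gt;
select fol_date&lt;br /&gt;
     , fol_b_animid as old_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as new_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;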
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
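&lt;br /&gt;
The mapping described above can be sketched as a query that yields one candidate FOLLOW_OBSERVERS row per follow and time period.  This is an illustration, not the conversion code: the &amp;lt;code&amp;gt;AM&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;PM&amp;lt;/code&amp;gt; period codes and the output column aliases are assumptions.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: the &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; codes are assumed, not confirmed.&lt;br /&gt;
-- A period is skipped when both of its observer columns are empty.&lt;br /&gt;
select fol_date, fol_b_animid, &amp;#039;AM&amp;#039; as period&lt;br /&gt;
     , nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) as obs_brec&lt;br /&gt;
     , nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
union all&lt;br /&gt;
select fol_date, fol_b_animid, &amp;#039;PM&amp;#039; as period&lt;br /&gt;
     , nullif(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) as obs_brec&lt;br /&gt;
     , nullif(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
order by fol_date, fol_b_animid, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;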
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the &amp;quot;NONE&amp;quot; (no observer) person for both observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
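&lt;br /&gt;
A minimal sketch of the substitution, shown for the morning observers only; the afternoon columns would be handled the same way.  The output column aliases are illustrative, and the query assumes the &amp;quot;NONE&amp;quot; person is identified by the literal code &amp;lt;code&amp;gt;NONE&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: fill a missing observer with the assumed &amp;#039;NONE&amp;#039; code.&lt;br /&gt;
-- Follows with no morning observers at all are left out.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;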
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates is complicated in the conversion process, so resolving this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
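&lt;br /&gt;
A minimal sketch of the substitution, previewing the value each affected arrival would receive.  The &amp;lt;code&amp;gt;cycle_state&amp;lt;/code&amp;gt; alias is illustrative, and the actual conversion step may be written differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show MISS in place of a NULL cycle code.&lt;br /&gt;
select fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid&lt;br /&gt;
     , coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where fa_type_of_cycle is null&lt;br /&gt;
  order by fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;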
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with the community entry date or the biography.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
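&lt;br /&gt;
As a worked example of this cutoff, using a made-up follow date of 2000-06-15: subtracting 1 year and then 14 years gives the latest birthdate that is flagged.  A female born on or before that date is at least 15 years old on the day of the follow.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: the follow date is an invented example value.&lt;br /&gt;
select &amp;#039;2000-06-15&amp;#039;::DATE&lt;br /&gt;
         - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
         - &amp;#039;14 years&amp;#039;::INTERVAL AS latest_flagged_birthdate;&lt;br /&gt;
-- Result: 1985-06-15 00:00:00&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;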
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
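&lt;br /&gt;
A minimal sketch of the change described above, assuming it is applied as a single PostgreSQL &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; against the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  The statement actually used by the conversion may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: recode n/a to MISS for female arriving individuals.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;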
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
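&lt;br /&gt;
The two recodings above could be sketched as follows, assuming they are applied as plain PostgreSQL &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; statements against the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is an illustration, not necessarily the statement the conversion runs.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the character case of the known codes.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;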
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for a row to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
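&lt;br /&gt;
The Problem section also describes a check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;, for which no query is given.  A sketch of that check, adapted from the start and end time query and using only the columns already named, might look like the following.  The row count it returns has not been re-verified here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (sketch)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;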
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
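&lt;br /&gt;
A minimal sketch of such a cleanup, assuming it is done with a PostgreSQL &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; on &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;.  The conversion may implement it differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal id,&lt;br /&gt;
-- touching only the rows that have them.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;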
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The MS Access database likely allows this condition because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
Spelling the column &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimId&amp;lt;/code&amp;gt; (with a lowercase final d) in this query produces the error message: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be either greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a value of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything concerning the affected rows, but it is adequate.&lt;br /&gt;
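&lt;br /&gt;
As a sketch, assuming the change is made with a PostgreSQL &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, it could be written as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace every 0 animal ID number with NULL,&lt;br /&gt;
-- without checking anything else about the row.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;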
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (It is not in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
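&lt;br /&gt;
A sketch of the second half of this proposal, assuming the &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; code has already been added to ARRIVAL_SOURCES (the layout of that table is not shown on this page, so that step is omitted):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: assumes a none code already exists in ARRIVAL_SOURCES.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  where fa_data_source is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;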
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=547</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=547"/>
		<updated>2026-03-25T22:46:37Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#60) Invalid biography_update_log.made_by values */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which help eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When the query is run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column contains non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 by ICG. Changed the end date of the KK_P1 membership to 8/14/2022.&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
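&lt;br /&gt;
A minimal sketch of the substitution, assuming it can be expressed as a query over the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. The conversion program may implement it differently, and the &amp;lt;code&amp;gt;made_by_converted&amp;lt;/code&amp;gt; column name is hypothetical:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace a NULL made_by with the unknown person, UNK.&lt;br /&gt;
select log.*&lt;br /&gt;
     , coalesce(log.made_by, &amp;#039;UNK&amp;#039;) as made_by_converted&lt;br /&gt;
  from clean.community_membership_update_log as log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;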
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single person code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column &lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error. Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column &lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error. Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore. Default to data from Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
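&lt;br /&gt;
The query above reports a slash-containing value only when another value exists that differs from it solely in character case, because of its self-join on the lowercased name.  As a minimal sketch, assuming the goal is to list every observer value that contains a &amp;quot;/&amp;quot; character, the following drops that self-join.  It has not been run against the data, so its row count may differ from the figure given above, and it will include any &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt; values.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: every distinct observer value containing a slash.&lt;br /&gt;
-- UNION removes duplicate values across the four columns.&lt;br /&gt;
WITH observers AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
)&lt;br /&gt;
-- NULL observers drop out here, since STRPOS(NULL, ...) is NULL.&lt;br /&gt;
SELECT observers.person&lt;br /&gt;
  FROM observers&lt;br /&gt;
  WHERE STRPOS(observers.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
  ORDER BY LOWER(observers.person), observers.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;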
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates is complicated in the conversion process, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
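&lt;br /&gt;
As a rough sketch of that substitution, the query below reads the arrivals with &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; in place of a NULL cycle code.  The &amp;lt;code&amp;gt;cycle_state&amp;lt;/code&amp;gt; column alias is made up for illustration, and this page does not say at which step the conversion actually makes the substitution.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: cycle_state is a hypothetical alias.&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , coalesce(follow_arrival.fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;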
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with the community entry date or the biography.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
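&lt;br /&gt;
The change described above could be written roughly as follows.  This is a sketch that reuses the join from the bad-data query, not necessarily the statement the conversion runs.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: mark female arrivals coded n/a as missing.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;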
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
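&lt;br /&gt;
The case corrections listed above amount to a small mapping.  A minimal sketch, assuming the corrections are applied to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema with a single statement (this page does not say how they were actually applied):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the mis-cased data source codes.&lt;br /&gt;
-- The WHERE clause limits the update to the three listed variants,&lt;br /&gt;
-- so the CASE expression never falls through to NULL.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source =&lt;br /&gt;
        case fa_data_source&lt;br /&gt;
          when &amp;#039;BREC&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
          when &amp;#039;brec&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
          when &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        end&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;, &amp;#039;TikI&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;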
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and checking just the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for a row to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
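&lt;br /&gt;
No query is shown for the check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;.  A sketch of it, assuming it has the same shape as the query above with the end time removed (it has not been confirmed that this is the query behind the 430 figure):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: checking against start time alone&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;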
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed FN&amp;#039;s community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
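&lt;br /&gt;
A minimal sketch of that cleanup, assuming it is done with a single update against &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal id.&lt;br /&gt;
-- The WHERE clause limits the update to the affected rows.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;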
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The MS Access database likely allows this condition because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (apparently those for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, though an adequate one: it does not validate anything about the affected rows.&lt;br /&gt;
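&lt;br /&gt;
A minimal sketch of the change, assuming it is applied directly to &amp;lt;code&amp;gt;clean.biography&amp;lt;/code&amp;gt; (where in the conversion this actually happens is not stated here):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Hypothetical fix: turn every 0 animal ID number into NULL.&lt;br /&gt;
UPDATE clean.biography&lt;br /&gt;
  SET b_animid_num = NULL&lt;br /&gt;
  WHERE b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;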
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person; it does not appear in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
===SOLUTION===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=546</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=546"/>
		<updated>2026-03-25T22:01:58Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#46) There are follow_arrivals where non-females have a cycle code that is other than n/a */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, table does not exist&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS, their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale.&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single person code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
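&lt;br /&gt;
Together with the trailing-space removal of problem #31, the cleanup might look like the following sketch.  It is an illustration only, not the conversion program&amp;#039;s actual code; it uses the same &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; expression as the queries for problem #33, and the output column names are made up for the example.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical illustration: show each animid that cleanup would change.&lt;br /&gt;
select distinct fol_b_animid as old_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as new_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by old_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;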
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
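&lt;br /&gt;
A minimal sketch of this substitution for the AM period, assuming &amp;lt;code&amp;gt;NONE&amp;lt;/code&amp;gt; is the code of the placeholder person.  It is an illustration only, not the conversion program&amp;#039;s actual code; the output column names simply follow the description in problem #36.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical illustration: fill a missing AM observer with NONE.&lt;br /&gt;
-- Follows with no AM observer at all produce no row (see problem #36).&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;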
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
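&lt;br /&gt;
A minimal sketch of this substitution, assuming the new code is literally &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.  It is an illustration only, not the conversion program&amp;#039;s actual code.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical illustration: use MISS where there is no cycle information.&lt;br /&gt;
select fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid&lt;br /&gt;
     , coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as fa_type_of_cycle&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  order by fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;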
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with the community entry date or the biography.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit and set a soft limit of 5 years in the warning system, with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
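&lt;br /&gt;
As a sanity check on the date arithmetic, here is a minimal sketch that applies the same comparison to two made-up birthdates and a made-up follow date of 2000-06-15.  None of these values come from the data, and the sketch assumes that &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;b_birthdate&amp;lt;/code&amp;gt; are dates.  The female who is exactly 15 on the follow date should be flagged, and the one who is a day younger should not.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical values, for illustration only&lt;br /&gt;
select d.birthdate&lt;br /&gt;
     , d.birthdate&lt;br /&gt;
         &amp;lt;= (date &amp;#039;2000-06-15&amp;#039;&lt;br /&gt;
             - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
             - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
            ) as flagged&lt;br /&gt;
  from (values (date &amp;#039;1985-06-15&amp;#039;)  -- exactly 15 on the follow date&lt;br /&gt;
             , (date &amp;#039;1985-06-16&amp;#039;)  -- one day short of 15&lt;br /&gt;
       ) as d (birthdate);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;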
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be fixed in the data.  Alternatively, we can adjust the hard limit and set a soft limit of 14 years in the warning system, with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
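&lt;br /&gt;
A minimal sketch of that change, assuming it is applied directly to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; with the same join as the query above.  The actual conversion code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: recode n/a to MISS for female arriving individuals&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;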
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
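&lt;br /&gt;
The two case corrections listed in this solution could be sketched as a single update.  This assumes the fix is applied to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;; the actual cleaning code may differ.  Adding the new codes is not shown, because the table that holds the valid codes is not described here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the capitalization of fa_data_source&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source&lt;br /&gt;
        = case fa_data_source&lt;br /&gt;
            when &amp;#039;BREC&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
            when &amp;#039;brec&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
            when &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
          end&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;, &amp;#039;TikI&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;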
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
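&lt;br /&gt;
The start-time-only check described in the Problem section is not written out above.  A sketch of it, adapted from the previous query by dropping &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; from the grouping and the match:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (sketch)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;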
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
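&lt;br /&gt;
A minimal sketch of the cleanup, assuming it is run as a plain update against &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;.  The &amp;lt;code&amp;gt;where&amp;lt;/code&amp;gt; clause is the same test used to find the bad rows, so only rows that actually change are touched.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal animid&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;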
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The MS Access database likely allows this condition because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be either greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force, but adequate, solution because it does not validate anything concerning the rows affected.&lt;br /&gt;
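&lt;br /&gt;
A minimal sketch of that change, assuming it is run as a plain update against &amp;lt;code&amp;gt;clean.biography&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace 0 animal ID numbers with NULL&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;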
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (It is not in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO CHANGE THE SINGLE ENTRY TO EVL&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
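&lt;br /&gt;
A minimal sketch of the second half of that change.  It assumes a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; row has already been added to ARRIVAL_SOURCES; that step is not shown because the table&amp;#039;s columns are not described here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: assumes &amp;#039;none&amp;#039; already exists in ARRIVAL_SOURCES&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  where fa_data_source is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;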
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=545</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=545"/>
		<updated>2026-03-25T22:01:25Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* Solution */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which help eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, table does not exist&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
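&lt;br /&gt;
As an illustration only (an assumption, not the conversion&amp;#039;s actual code), casting the column to &amp;lt;code&amp;gt;TIME&amp;lt;/code&amp;gt; in PostgreSQL drops the date part and keeps the time of day:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: brec_time_only is a hypothetical alias, not a column in the new design.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;::TIME as brec_time_only&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;;&amp;lt;/pre&amp;gt;&lt;br /&gt;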
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
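&lt;br /&gt;
The substitution can be sketched as follows. This is a minimal sketch that assumes the unknown person&amp;#039;s code is the string &amp;#039;UNK&amp;#039;; the conversion code in the commit above may differ:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: show made_by with NULL replaced by the assumed &amp;#039;UNK&amp;#039; person code.&lt;br /&gt;
select date_of_update, chimp_id, coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log;&amp;lt;/pre&amp;gt;&lt;br /&gt;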
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error. Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last arrival/last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error. Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the follow_arrival value.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#27) Mismatch of end-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
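&lt;br /&gt;
A minimal sketch of the combined cleanup from problems #31 and #32, run against made-up sample values rather than real animids.  It assumes the conversion applies &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt;, the same expression used in the queries for problem #33; the actual conversion code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample values, not taken from the data&lt;br /&gt;
select raw_animid&lt;br /&gt;
     , upper(rtrim(raw_animid)) as clean_animid&lt;br /&gt;
  from (values (&amp;#039;AB &amp;#039;), (&amp;#039;cd&amp;#039;), (&amp;#039;EF&amp;#039;))&lt;br /&gt;
       as samples (raw_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;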
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
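&lt;br /&gt;
The row-creation rule described in the problem can be sketched as follows.  This is an illustration on one made-up follow, not the actual conversion code: the period codes &amp;lt;code&amp;gt;AM&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;PM&amp;lt;/code&amp;gt;, the &amp;lt;code&amp;gt;sample_follow&amp;lt;/code&amp;gt; name, and the observer values are all assumptions.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample follow; the period codes AM and PM are assumed&lt;br /&gt;
with sample_follow (fol_am_observer_1, fol_am_observer_2&lt;br /&gt;
                  , fol_pm_observer_1, fol_pm_observer_2) as&lt;br /&gt;
  (values (&amp;#039;XX&amp;#039;, &amp;#039;YY&amp;#039;, &amp;#039; &amp;#039;, null))&lt;br /&gt;
select &amp;#039;AM&amp;#039; as period&lt;br /&gt;
     , btrim(fol_am_observer_1) as obs_brec&lt;br /&gt;
     , btrim(fol_am_observer_2) as obs_tiki&lt;br /&gt;
  from sample_follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
union all&lt;br /&gt;
select &amp;#039;PM&amp;#039;&lt;br /&gt;
     , btrim(fol_pm_observer_1)&lt;br /&gt;
     , btrim(fol_pm_observer_2)&lt;br /&gt;
  from sample_follow&lt;br /&gt;
  where coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The sample follow yields a single AM row, because both PM observer columns are blank or NULL.  Neither branch looks at the follow start or stop times, which is the gap this problem describes.&lt;br /&gt;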
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person both observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
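&lt;br /&gt;
One way to express the substitution, as a sketch on made-up observer values.  The column names &amp;lt;code&amp;gt;observer_1&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;observer_2&amp;lt;/code&amp;gt; are hypothetical stand-ins for the am and pm observer columns, and the real conversion may do this differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample values; NONE is the placeholder person&lt;br /&gt;
select observer_1, observer_2&lt;br /&gt;
       -- nullif turns a blank into NULL, coalesce then supplies NONE&lt;br /&gt;
     , coalesce(nullif(btrim(observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_tiki&lt;br /&gt;
  from (values (&amp;#039;XX&amp;#039;, null), (&amp;#039;  &amp;#039;, &amp;#039;YY&amp;#039;))&lt;br /&gt;
       as samples (observer_1, observer_2);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;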
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these variants and excluding the duplicates is complicated in the conversion process, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
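&lt;br /&gt;
A sketch of how a mixed-case variant could be preferred among names that differ only by case.  The names are made up, and the final tie-break on &amp;lt;code&amp;gt;person&amp;lt;/code&amp;gt; (used when no mixed-case variant exists) is an assumption, not something the solution specifies.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical observer names; the fallback ordering is an assumption&lt;br /&gt;
WITH people (person) AS (&lt;br /&gt;
  VALUES (&amp;#039;JOHN&amp;#039;), (&amp;#039;John&amp;#039;), (&amp;#039;john&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
SELECT DISTINCT ON (LOWER(person))&lt;br /&gt;
       LOWER(person) AS name_key, person AS chosen&lt;br /&gt;
  FROM people&lt;br /&gt;
  ORDER BY LOWER(person)&lt;br /&gt;
           -- TRUE sorts first with DESC, so a mixed-case name wins&lt;br /&gt;
         , (person &amp;lt;&amp;gt; UPPER(person)&lt;br /&gt;
            AND person &amp;lt;&amp;gt; LOWER(person)) DESC&lt;br /&gt;
         , person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For the sample names this picks &amp;lt;code&amp;gt;John&amp;lt;/code&amp;gt;.&lt;br /&gt;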
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with the community entry date or the biography.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
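&lt;br /&gt;
To illustrate where the age test flips, here is the same comparison applied to a made-up birthdate and two made-up follow dates.  The dates are hypothetical and chosen only to show the boundary.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical birthdate and follow dates, showing the boundary of the test&lt;br /&gt;
select fol_date&lt;br /&gt;
     , birthdate &amp;lt;= (fol_date&lt;br /&gt;
                     - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                     - &amp;#039;14 years&amp;#039;::INTERVAL) as flagged&lt;br /&gt;
  from (values (&amp;#039;1990-01-01&amp;#039;::date, &amp;#039;2004-12-31&amp;#039;::date)&lt;br /&gt;
             , (&amp;#039;1990-01-01&amp;#039;::date, &amp;#039;2005-01-01&amp;#039;::date))&lt;br /&gt;
       as samples (birthdate, fol_date);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The follow on 2004-12-31 is not flagged, because the female is still 14.  The follow on 2005-01-01, her 15th birthday, is flagged.&lt;br /&gt;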
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
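&lt;br /&gt;
A sketch of the substitution on made-up rows.  It assumes the sex is available alongside the cycle code, as in the query above; how the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema actually applies the change is not shown here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample rows; only females with n/a are changed&lt;br /&gt;
select b_sex, fa_type_of_cycle&lt;br /&gt;
     , case&lt;br /&gt;
         when b_sex = &amp;#039;F&amp;#039; and fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
           then &amp;#039;MISS&amp;#039;&lt;br /&gt;
         else fa_type_of_cycle&lt;br /&gt;
       end as clean_cycle&lt;br /&gt;
  from (values (&amp;#039;F&amp;#039;, &amp;#039;n/a&amp;#039;), (&amp;#039;F&amp;#039;, &amp;#039;U&amp;#039;), (&amp;#039;M&amp;#039;, &amp;#039;n/a&amp;#039;))&lt;br /&gt;
       as samples (b_sex, fa_type_of_cycle);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;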
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
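&lt;br /&gt;
The case fixes listed above can be sketched as a single expression.  This is an illustration on sample codes, not the actual cleanup code; codes that are not listed pass through unchanged.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sample codes from the solution; any other value passes through unchanged&lt;br /&gt;
select fa_data_source as raw_code&lt;br /&gt;
     , case&lt;br /&gt;
         when fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;) then &amp;#039;Brec&amp;#039;&lt;br /&gt;
         when fa_data_source = &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
         else fa_data_source&lt;br /&gt;
       end as clean_code&lt;br /&gt;
  from (values (&amp;#039;BREC&amp;#039;), (&amp;#039;brec&amp;#039;), (&amp;#039;TikI&amp;#039;), (&amp;#039;Tiki_GM&amp;#039;))&lt;br /&gt;
       as samples (fa_data_source);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;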
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for two rows to be duplicates?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
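The problem description also mentions a check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;, but no query is listed for it. A sketch following the same pattern as the query above is given here; it is an assumption that the 430 row count was produced this way.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (sketch; same pattern as the query above)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;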
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
==(#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
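One way to remove the trailing spaces is sketched below. It assumes the column is a text or varchar type, and that trimming does not produce rows that collide on any uniqueness constraint; neither assumption is confirmed on this page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: trim only the rows that actually have trailing spaces&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;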
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The MS Access database likely allows this condition because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force, but adequate, solution because it does not validate anything concerning the rows affected.&lt;br /&gt;
&lt;br /&gt;
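The change described above amounts to a single statement. A minimal sketch, assuming it is applied in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: replace every 0 animal ID number with NULL, without further validation&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;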
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN TO CHANGE THE SINGLE ENTRY TO EVL&lt;br /&gt;
&lt;br /&gt;
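If the change were made in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema instead of MS Access, it might look like the sketch below. It assumes that &amp;lt;code&amp;gt;EVL&amp;lt;/code&amp;gt; already exists as a &amp;lt;code&amp;gt;people.person&amp;lt;/code&amp;gt; value, which is not verified here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: assumes EVL is already a valid people.person value&lt;br /&gt;
update clean.biography_update_log&lt;br /&gt;
  set made_by = &amp;#039;EVL&amp;#039;&lt;br /&gt;
  where made_by = &amp;#039;SF/EVL&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;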
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
===SOLUTION===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
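The second half of this solution could be sketched as follows. The step that creates the &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; row in ARRIVAL_SOURCES is not shown, because the columns of that table are not given on this page; the statement assumes that row already exists.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: assumes a &amp;#039;none&amp;#039; code has already been added to ARRIVAL_SOURCES&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  where fa_data_source is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;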
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=544</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=544"/>
		<updated>2026-02-16T17:01:51Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#63) There are follow_arrival rows with NULL fa_data_source values */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
        SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
        TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
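&lt;br /&gt;
As a sketch only (the actual change is in the commit above), the substitution can be expressed with &amp;lt;code&amp;gt;coalesce&amp;lt;/code&amp;gt;.  The table and column names are taken from the query above, and &amp;#039;UNK&amp;#039; is assumed to be the code of the unknown person; the output column name is hypothetical.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: show each log row with NULL made_by replaced by the unknown person.&lt;br /&gt;
-- made_by_or_unk is a hypothetical output column name.&lt;br /&gt;
select log.*&lt;br /&gt;
     , coalesce(log.made_by, &amp;#039;UNK&amp;#039;) as made_by_or_unk&lt;br /&gt;
  from clean.community_membership_update_log as log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;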
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
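&lt;br /&gt;
A sketch of deriving the begin-in-nest flag from follow_arrival alone, assuming (as the query above does) that fa_type_of_nesting values 1 and 3 mean the focal started in a nest.  The output column name begin_in_nest is hypothetical, and this is not necessarily how the conversion does it.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: compute a begin-in-nest flag from the focal&amp;#039;s first arrival.&lt;br /&gt;
-- A NULL fa_type_of_nesting yields 0 here; that choice is an assumption.&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.fa_fol_date, spans.fa_fol_b_focal_animid&lt;br /&gt;
     , CASE WHEN first_arrivals.fa_type_of_nesting IN (1, 3)&lt;br /&gt;
            THEN 1 ELSE 0 END AS begin_in_nest&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN easy.follow_arrival AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = spans.fa_fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
          -- Only the focal&amp;#039;s own arrival row, not others arriving at the same time.&lt;br /&gt;
          AND first_arrivals.fa_b_arr_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;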
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
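&lt;br /&gt;
A sketch of the cleanup for problems #31 and #32 combined, using the same &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; expression that appears in the query for problem #33.  The output column names are hypothetical, and the conversion code itself may be written differently.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: list the follows whose animid changes under cleanup,&lt;br /&gt;
-- showing the value before and after.&lt;br /&gt;
select fol_date&lt;br /&gt;
     , fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;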
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the &amp;quot;NONE&amp;quot; (no observer) person for both observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
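&lt;br /&gt;
A sketch of the substitution for the AM period, assuming &amp;#039;NONE&amp;#039; is the code of the &amp;quot;NONE&amp;quot; person and following the space trimming described in problem #36.  The output column names are hypothetical; the PM columns would be handled the same way.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: for follows with at least one AM observer, fill the&lt;br /&gt;
-- missing observer with the NONE person.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as am_obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as am_obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;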
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  -- List each observer value that contains a &amp;quot;/&amp;quot; character.&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
    WHERE STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  In terms of unique names, that is about half as many.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates is complicated in the conversion process, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with either the community entry date or the biography.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; code is an indication that future cleanup is required.&lt;br /&gt;
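&lt;br /&gt;
A minimal sketch of the change to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; described above, assuming &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; is already an accepted &amp;lt;code&amp;gt;fa_type_of_cycle&amp;lt;/code&amp;gt; value in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  The statement actually used is not recorded on this page, so treat this as an illustration only:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: assumes MISS is a valid cycle code in clean.follow_arrival.&lt;br /&gt;
-- Females with a cycle code of n/a get the MISS code instead.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;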
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of: &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
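&lt;br /&gt;
A minimal sketch of the two case fixes above, assuming they are applied to the &amp;lt;code&amp;gt;follow_arrival&amp;lt;/code&amp;gt; table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  The choice of schema is an assumption, and this page does not record the statements actually used:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: assumes the fix is made in clean.follow_arrival.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;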
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and checking just the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for a row to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
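&lt;br /&gt;
The Problem section also mentions a check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;, but no query for it is recorded.  A sketch of that check, assuming it mirrors the start and end time query with &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; removed (whether this is the exact query that produced the reported count is not confirmed here):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: checking against start time, with fa_time_end left out&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;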
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
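&lt;br /&gt;
A minimal sketch of the cleanup, assuming the focal id column holds variable-length text so that trailing spaces are significant in comparisons (the statement actually used is not recorded on this page):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strips trailing spaces from the focal id in the clean schema.&lt;br /&gt;
-- The where clause limits the update to the rows that actually change.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;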
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
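MS Access seems to allow this condition, likely because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
Running the query with the column spelled &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimId&amp;lt;/code&amp;gt; (lowercase &amp;quot;d&amp;quot;) fails with:&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;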
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything about the affected rows, but it is adequate.&lt;br /&gt;
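&lt;br /&gt;
A minimal sketch of that change, assuming it is applied directly to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (the conversion program may implement it differently):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: replace every 0 animal ID number with NULL.&lt;br /&gt;
-- Assumption: the fix is made in clean.biography rather than earlier in the conversion.&lt;br /&gt;
UPDATE clean.biography&lt;br /&gt;
  SET b_animid_num = NULL&lt;br /&gt;
  WHERE b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;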
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO CHANGE THE SINGLE ENTRY TO EVL&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
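&lt;br /&gt;
One way this could be done is sketched below. It assumes the column in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has the same name as in the query above and that the fix is made with a direct update; the conversion program may do it another way.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: strip trailing spaces from the arriving animid.&lt;br /&gt;
-- The WHERE clause limits the update to rows that actually change.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_b_arr_animid = rtrim(fa_b_arr_animid)&lt;br /&gt;
  WHERE fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&amp;lt;/pre&amp;gt;&lt;br /&gt;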
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
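&lt;br /&gt;
Once a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; row exists in ARRIVAL_SOURCES, the replacement itself might look like the following sketch. The literal &amp;lt;code&amp;gt;&amp;#039;none&amp;#039;&amp;lt;/code&amp;gt; and the use of a direct update are assumptions; the actual fix may be made elsewhere in the conversion.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: substitute the &amp;#039;none&amp;#039; code for NULL data sources.&lt;br /&gt;
-- Assumption: a matching &amp;#039;none&amp;#039; row has already been added to ARRIVAL_SOURCES.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  WHERE fa_data_source IS NULL;&amp;lt;/pre&amp;gt;&lt;br /&gt;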
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=543</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=543"/>
		<updated>2026-02-16T17:01:23Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* Solution */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each problem was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the query fails because the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column contains non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 by ICG.&lt;br /&gt;
Changed the end date of the KK_P1 membership to 8/14/2022.&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
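&lt;br /&gt;
The substitution described above might look like the following sketch. It assumes an &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; row already exists in the people table and that the change is applied directly in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema; the referenced commit may implement it differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: attribute log rows with no MadeBy to the unknown person.&lt;br /&gt;
-- Assumption: clean.people already contains a row whose person is &amp;#039;UNK&amp;#039;.&lt;br /&gt;
UPDATE clean.community_membership_update_log&lt;br /&gt;
  SET made_by = &amp;#039;UNK&amp;#039;&lt;br /&gt;
  WHERE made_by IS NULL;&amp;lt;/pre&amp;gt;&lt;br /&gt;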
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
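&lt;br /&gt;
A minimal sketch of the combined cleanup from problems #31 and #32, assuming the conversion reads from the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.  The &amp;lt;code&amp;gt;cleaned_animid&amp;lt;/code&amp;gt; alias is hypothetical, and the actual conversion code may differ:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- List each raw animid that changes when trailing spaces are removed&lt;br /&gt;
-- and the result is forced to upper-case, next to its cleaned form.&lt;br /&gt;
select distinct fol_b_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;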
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
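&lt;br /&gt;
A sketch of how this substitution could be expressed, assuming it is applied while reading follow_arrival.  The &amp;lt;code&amp;gt;cycle_state&amp;lt;/code&amp;gt; alias is hypothetical, and the conversion itself may do this differently:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Replace a missing cycle code with MISS; other codes pass through unchanged.&lt;br /&gt;
select fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid&lt;br /&gt;
     , coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  order by fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;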
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
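&lt;br /&gt;
To preview which rows the rule above would change, something like the following could be run.  This is a sketch only; it assumes the cycle code &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; is stored as the text value &amp;#039;0&amp;#039;, and the fix itself is still to be made in MS Access:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Non-female arriving individuals (including unknown sex) whose&lt;br /&gt;
-- cycle code is 0; these are the rows that would become n/a.&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;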
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;, or&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for a row to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The MS Access database likely allows this condition because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
The query originally spelled the focal column &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimId&amp;lt;/code&amp;gt; and failed with the error below; the column is named &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, which the query now uses.&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything about the affected rows, but it is adequate.&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN TO CHANGE THE SINGLE ENTRY TO EVL&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
===SOLUTION===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=542</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=542"/>
		<updated>2026-02-16T16:59:59Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#61) Invalid follow date/focals pairs in follow_arrival */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
        SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
        TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
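&lt;br /&gt;
A minimal sketch of how such a view might derive DadID.  It assumes, hypothetically, that DadIDPrelim is a boolean flag; the inline rows and the lower-case column names are made up for illustration and are not taken from the actual schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: made-up rows stand in for BIOGRAPHY_DATA.&lt;br /&gt;
WITH biography_data (animid, dad_id, dadidprelim) AS&lt;br /&gt;
  (VALUES (&amp;#039;AAA&amp;#039;, &amp;#039;BBB&amp;#039;, FALSE)&lt;br /&gt;
        , (&amp;#039;CCC&amp;#039;, &amp;#039;DDD&amp;#039;, TRUE))&lt;br /&gt;
SELECT animid&lt;br /&gt;
     -- Append the suffix only when the paternity is preliminary.&lt;br /&gt;
     , CASE WHEN dadidprelim&lt;br /&gt;
            THEN dad_id || &amp;#039;_prelim&amp;#039;&lt;br /&gt;
            ELSE dad_id&lt;br /&gt;
       END AS dadid&lt;br /&gt;
  FROM biography_data;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;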
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
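&lt;br /&gt;
This page does not say how the conversion fixes these values.  One possibility, shown here purely as an assumption, is to keep the time portion and drop the date with a cast.  The inline rows are made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: made-up rows stand in for BRECORD_NOTES.&lt;br /&gt;
WITH brecord_notes (brec_time) AS&lt;br /&gt;
  (VALUES (TIMESTAMP &amp;#039;1899-12-30 07:15:00&amp;#039;)&lt;br /&gt;
        , (TIMESTAMP &amp;#039;2004-03-02 07:15:00&amp;#039;))&lt;br /&gt;
SELECT brec_time&lt;br /&gt;
     -- The cast discards the date, whatever it happens to be.&lt;br /&gt;
     , brec_time::TIME AS time_only&lt;br /&gt;
  FROM brecord_notes;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;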
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
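&lt;br /&gt;
A minimal sketch of the substitution described above, not the code from the commit.  The inline rows and the person code &amp;#039;XYZ&amp;#039; are made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: made-up rows stand in for the update log.&lt;br /&gt;
WITH log (chimp_id, made_by) AS&lt;br /&gt;
  (VALUES (&amp;#039;AAA&amp;#039;, &amp;#039;XYZ&amp;#039;)&lt;br /&gt;
        , (&amp;#039;BBB&amp;#039;, NULL))&lt;br /&gt;
SELECT chimp_id&lt;br /&gt;
     -- A NULL MadeBy becomes the unknown person.&lt;br /&gt;
     , COALESCE(made_by, &amp;#039;UNK&amp;#039;) AS made_by&lt;br /&gt;
  FROM log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;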
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore. Default to data from Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
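&lt;br /&gt;
The cleanup described in problems #31 and #32 can be sketched as follows.  This is an illustration under assumptions, not the conversion code itself; the inline animids are made up.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: made-up animids stand in for FOLLOW rows.&lt;br /&gt;
WITH follow (fol_b_animid) AS&lt;br /&gt;
  (VALUES (&amp;#039;XXA  &amp;#039;)   -- trailing spaces&lt;br /&gt;
        , (&amp;#039;xxb&amp;#039;)     -- lower-case&lt;br /&gt;
        , (&amp;#039;XXC&amp;#039;))    -- already clean&lt;br /&gt;
SELECT fol_b_animid AS original&lt;br /&gt;
     -- Strip trailing spaces first, then force upper-case.&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) AS cleaned&lt;br /&gt;
  FROM follow;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;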
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that were done before their focal was under study, that is, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person both observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
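&lt;br /&gt;
A possible query for listing the affected follows, written as a sketch in the style of the problem #37 query.  It assumes that an observer column counts as empty when it is NULL or blank after trimming, as described in problem #36.  It has not been run against the data.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: follows where exactly one observer of a time period is recorded&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
             &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;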
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for this and excluding duplicates in the conversion process is complicated, so resolving it is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
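&lt;br /&gt;
The cutoff arithmetic in the query above can be checked with a standalone example.  This is only an illustration; the follow date is made up and is not taken from the data.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- For a hypothetical follow on 2000-06-15, the latest birthdate the query matches&lt;br /&gt;
select &amp;#039;2000-06-15&amp;#039;::DATE&lt;br /&gt;
         - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
         - &amp;#039;14 years&amp;#039;::INTERVAL AS latest_birthdate;&lt;br /&gt;
-- Yields 1985-06-15 00:00:00, so a matching female is at least 15 years old.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;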
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
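&lt;br /&gt;
The two case fixes above can be previewed with a query along these lines.  This is a sketch only: it handles just the spellings named above, and it is not necessarily how the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema was actually produced.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: count rows per data source after normalizing the known case variants&lt;br /&gt;
select case&lt;br /&gt;
         when follow_arrival.fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;) then &amp;#039;Brec&amp;#039;&lt;br /&gt;
         when follow_arrival.fa_data_source = &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
         else follow_arrival.fa_data_source&lt;br /&gt;
       end as normalized_source&lt;br /&gt;
     , count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  group by 1&lt;br /&gt;
  order by 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;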
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
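&lt;br /&gt;
The start-time-only check described in the problem statement has no query listed.  A sketch of one, assuming it mirrors the query above with &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; left off; it has not been confirmed to reproduce the 430-row count.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: checking against start time only&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;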
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
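&lt;br /&gt;
A minimal sketch of that cleanup, assuming the fix is applied with a direct update of &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the conversion may instead trim the values while loading):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only; assumes no key or constraint depends on the untrimmed value.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;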
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
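&lt;br /&gt;
One way to inform that decision is to check whether the rows sharing a key are identical in every column (candidates for collapsing) or differ somewhere (need review). A sketch, assuming the PostgreSQL whole-row cast to text is an acceptable way to compare rows; the &amp;lt;code&amp;gt;n_rows&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;n_distinct_rows&amp;lt;/code&amp;gt; names are illustrative:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- n_distinct_rows = 1 means every row sharing the key is an exact copy.&lt;br /&gt;
SELECT &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
     , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
     , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
     , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
     , count(*) AS n_rows&lt;br /&gt;
     , count(DISTINCT gb::text) AS n_distinct_rows&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot; AS gb&lt;br /&gt;
  GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
         , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
         , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
         , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
  HAVING count(*) &amp;gt; 1&lt;br /&gt;
  ORDER BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
         , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
         , &amp;quot;GRM_time_begin&amp;quot;;&amp;lt;/pre&amp;gt;&lt;br /&gt;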
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The MS Access database likely allows this condition because of a difference in character case between column names and primary key designations.&lt;br /&gt;
As originally written, with &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;, the query above failed with:&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force, but adequate, solution: it does not validate anything about the affected rows.&lt;br /&gt;
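&lt;br /&gt;
A minimal sketch of the change, assuming it is made directly in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema rather than in the MS Access source:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: replace every 0 animal ID number with NULL.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;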
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN TO CHANGE THE SINGLE ENTRY TO EVL&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=541</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=541"/>
		<updated>2026-02-16T16:58:41Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#60) Invalid biography_update_log.made_by values */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When the query is run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
        SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
        TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
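&lt;br /&gt;
As a rough illustration only (not the code from the commit above), the substitution can be previewed in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  The output column name &amp;lt;code&amp;gt;converted_made_by&amp;lt;/code&amp;gt; is hypothetical, and the sketch assumes the unknown person&amp;#039;s code is exactly &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt;:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: show each NULL MadeBy row next to the value it would convert to.&lt;br /&gt;
-- converted_made_by is a made-up name used only for this preview.&lt;br /&gt;
select log.*&lt;br /&gt;
     , coalesce(log.made_by, &amp;#039;UNK&amp;#039;) as converted_made_by&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where log.made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;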
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore. Default to data from Follow_Arrival&lt;br /&gt;
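&lt;br /&gt;
A minimal sketch of what defaulting to Follow_Arrival could look like, not the conversion&amp;#039;s actual code.  It assumes, based only on the query above, that &amp;lt;code&amp;gt;fa_type_of_nesting&amp;lt;/code&amp;gt; values 1 and 3 mean the focal started in a nest.  The output column &amp;lt;code&amp;gt;begin_in_nest&amp;lt;/code&amp;gt; is a hypothetical name.  Unlike the query above, the sketch restricts the first arrival to the focal&amp;#039;s own row:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: derive a begin-in-nest flag from the focal&amp;#039;s first arrival only.&lt;br /&gt;
-- A NULL fa_type_of_nesting (see problem #41) falls through to 0 here.&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , CASE WHEN first_arrivals.fa_type_of_nesting IN (1, 3)&lt;br /&gt;
            THEN 1 ELSE 0 END AS begin_in_nest&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = spans.fa_fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
          AND first_arrivals.fa_b_arr_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;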
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
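&lt;br /&gt;
The clean-up for this problem and for problem #31 can be previewed together.  This is only a sketch using the same &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; expression that the queries in problem #33 use.  The output column names are made up for the preview, and the actual conversion code may differ:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: list each raw animid that changes under trimming and upper-casing.&lt;br /&gt;
-- original_animid and normalized_animid are hypothetical output names.&lt;br /&gt;
select distinct fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as normalized_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by original_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;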
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
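&lt;br /&gt;
The mapping described above can be sketched as a query.  This is an illustration under assumptions, not the conversion program itself.  The period codes &amp;lt;code&amp;gt;AM&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;PM&amp;lt;/code&amp;gt; and the output column names are hypothetical:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: one row per follow and time period, skipping a period when both&lt;br /&gt;
-- of its observer columns are NULL or empty after trimming.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , &amp;#039;AM&amp;#039; as period   -- hypothetical period code&lt;br /&gt;
     , nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) as obs_brec&lt;br /&gt;
     , nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
union all&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , &amp;#039;PM&amp;#039; as period   -- hypothetical period code&lt;br /&gt;
     , nullif(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) as obs_brec&lt;br /&gt;
     , nullif(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;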
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
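&lt;br /&gt;
A minimal sketch of the substitution, shown for the morning observer columns only.  It assumes the person code is exactly &amp;lt;code&amp;gt;NONE&amp;lt;/code&amp;gt;, and the output column names are hypothetical:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: follows where exactly one morning observer is recorded, with the&lt;br /&gt;
-- missing observer replaced by the NONE person.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;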
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates is complicated to do in the conversion process, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
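&lt;br /&gt;
A minimal sketch of this change, assuming the &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; code has already been added to &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; and that the update is applied in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (both are assumptions, not a record of what was actually run):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: &amp;#039;MISS&amp;#039; is already a valid CYCLE_STATES code.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;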
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with the community entry date or the biography.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
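&lt;br /&gt;
The change described above could be sketched as follows.  This assumes &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; is already a valid &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; code; it is an illustration reusing the join from the query above, not necessarily the statement used in the conversion.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: &amp;#039;MISS&amp;#039; is already a valid CYCLE_STATES code.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;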
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
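&lt;br /&gt;
The two case corrections above might look like the following in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is a sketch only; it does not cover adding the new codes, since the structure of the table holding the valid codes is not shown here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Normalize the capitalization variants of Brec.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
-- Normalize the capitalization variant of Tiki.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;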
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
The combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt; is always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and checking only the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for two rows to be duplicates?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
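&lt;br /&gt;
The start-time-only check described in the Problem section can be sketched the same way, by dropping &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; from the grouping and from the match.  This is a reconstruction from that description, not necessarily the query that produced the count given there.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (reconstructed; see the note above)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;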
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
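&lt;br /&gt;
A minimal sketch of the cleanup, assuming it is applied directly to &amp;lt;code&amp;gt;follow_arrival&amp;lt;/code&amp;gt; in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and reusing the predicate from the query above:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Only touch the rows that actually have trailing spaces.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;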
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between column names and primary key designations.&lt;br /&gt;
An earlier version of the query above spelled the column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; (ending in a lowercase &amp;quot;d&amp;quot;) and failed with:&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be either greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, in that it does not validate anything concerning the rows affected, but it is adequate.&lt;br /&gt;
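&lt;br /&gt;
As a sketch, assuming the change is made in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema where the query above finds the rows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;update clean.biography&lt;br /&gt;
  set b_animid_num = null&lt;br /&gt;
  where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;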
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN TO CHANGE THE SINGLE ENTRY TO EVL&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the &amp;lt;code&amp;gt;follow_arrival&amp;lt;/code&amp;gt; table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
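&lt;br /&gt;
A minimal sketch of that cleanup, assuming the column is textual and that an in-place update of the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; table is acceptable (the conversion itself may instead trim the values while loading):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Assumption: run after clean.follow_arrival has been populated.&lt;br /&gt;
-- Only rows that actually have trailing spaces are rewritten.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_b_arr_animid = rtrim(fa_b_arr_animid)&lt;br /&gt;
  WHERE fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&amp;lt;/pre&amp;gt;&lt;br /&gt;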
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
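&lt;br /&gt;
One way the substitution might look, as a sketch only: it assumes the &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; code has already been added to ARRIVAL_SOURCES, and that fa_data_source stores that code as the literal text &amp;#039;none&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Assumption: the &amp;#039;none&amp;#039; code already exists in ARRIVAL_SOURCES.&lt;br /&gt;
-- Replace every NULL data source with that code.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  WHERE fa_data_source IS NULL;&amp;lt;/pre&amp;gt;&lt;br /&gt;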
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=540</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=540"/>
		<updated>2026-02-16T16:57:26Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#59) Zero BIOGRAPHY.b_animid_num values */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which help eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, table does not exist&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
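&lt;br /&gt;
The substitution described above could be sketched as follows. This is an illustration, not the code from the commit: it assumes the unknown person has already been added to &amp;lt;code&amp;gt;clean.people&amp;lt;/code&amp;gt; with the person code &amp;#039;UNK&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Assumption: a people row with person = &amp;#039;UNK&amp;#039; already exists.&lt;br /&gt;
-- Point every log row that lacks a MadeBy at the unknown person.&lt;br /&gt;
UPDATE clean.community_membership_update_log&lt;br /&gt;
  SET made_by = &amp;#039;UNK&amp;#039;&lt;br /&gt;
  WHERE made_by IS NULL;&amp;lt;/pre&amp;gt;&lt;br /&gt;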
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error. Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error. Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
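&lt;br /&gt;
A sketch of the combined cleanup from problems #31 and #32, written as a preview query and not necessarily how the conversion implements it.  The &amp;lt;code&amp;gt;cleaned_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;in_biography&amp;lt;/code&amp;gt; aliases are illustrative names only.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Show each animid that cleanup changes, next to its cleaned form,&lt;br /&gt;
-- and whether the cleaned form exists on biography.&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
     , upper(rtrim(follow.fol_b_animid)) as cleaned_animid&lt;br /&gt;
     , exists&lt;br /&gt;
         (select 1&lt;br /&gt;
            from clean.biography&lt;br /&gt;
            where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
         as in_biography&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(follow.fol_b_animid)) &amp;lt;&amp;gt; follow.fol_b_animid&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rows where &amp;lt;code&amp;gt;in_biography&amp;lt;/code&amp;gt; is false would be candidates for problem #33.&lt;br /&gt;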
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these variants, and excluding the resulting duplicates, complicates the conversion process.  So resolution of this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
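&lt;br /&gt;
A sketch of the substitution as a preview query, not the conversion&amp;#039;s actual code.  It assumes the &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; code has already been added to &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt;; the &amp;lt;code&amp;gt;converted_cycle&amp;lt;/code&amp;gt; alias is an illustrative name.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Show the cycle code each NULL row would receive.&lt;br /&gt;
select follow_arrival.fa_fol_date&lt;br /&gt;
     , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
     , follow_arrival.fa_b_arr_animid&lt;br /&gt;
     , coalesce(follow_arrival.fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as converted_cycle&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_type_of_cycle is null&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;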
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
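&lt;br /&gt;
As a sketch, the rows that the second fix would touch can be counted by sex before the change is made in Access.  The &amp;lt;code&amp;gt;rows_to_change&amp;lt;/code&amp;gt; alias is an illustrative name, and the query assumes the cycle code is stored as the text &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Count the non-female arrivals whose cycle code is 0.&lt;br /&gt;
select biography.b_sex, count(*) as rows_to_change&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;&lt;br /&gt;
  group by biography.b_sex&lt;br /&gt;
  order by biography.b_sex;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Any rows left in the earlier summary query after this change would be the ones needing a fix by hand.&lt;br /&gt;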
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
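&lt;br /&gt;
To illustrate the cutoff arithmetic in the query above, here is a small worked example.  The follow date and birthdates are hypothetical values chosen for illustration, not rows from the data.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- For a follow on 2000-06-15 the cutoff is 15 years earlier.&lt;br /&gt;
-- A birthdate on the cutoff matches; one a day later does not.&lt;br /&gt;
select date &amp;#039;2000-06-15&amp;#039;&lt;br /&gt;
         - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
         - &amp;#039;14 years&amp;#039;::INTERVAL AS cutoff&lt;br /&gt;
     , date &amp;#039;1985-06-15&amp;#039;&lt;br /&gt;
         &amp;lt;= (date &amp;#039;2000-06-15&amp;#039;&lt;br /&gt;
            - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
            - &amp;#039;14 years&amp;#039;::INTERVAL) AS born_on_cutoff&lt;br /&gt;
     , date &amp;#039;1985-06-16&amp;#039;&lt;br /&gt;
         &amp;lt;= (date &amp;#039;2000-06-15&amp;#039;&lt;br /&gt;
            - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
            - &amp;#039;14 years&amp;#039;::INTERVAL) AS born_day_after;&lt;br /&gt;
-- cutoff              | born_on_cutoff | born_day_after&lt;br /&gt;
-- 1985-06-15 00:00:00 | t              | f&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
So a female is flagged only once she has reached her 15th birthday on the day of the follow.&lt;br /&gt;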
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
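&lt;br /&gt;
The two case corrections above could be sketched as follows.  This assumes the changes are made in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, as the &amp;quot;Bad data&amp;quot; section suggests; it is an illustration, not necessarily the statement the conversion actually runs.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the case of the known data source codes.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;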
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
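&lt;br /&gt;
The Problem section also mentions a check on start time alone, for which no query is listed.  A sketch of that check, following the pattern of the queries above; whether its output matches the 430 figure depends on how those rows were counted, which is not stated here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: checking against start time, ignoring end time&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
     , fa_fol_b_focal_animid&lt;br /&gt;
     , fa_b_arr_animid&lt;br /&gt;
     , fa_time_start&lt;br /&gt;
     , count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  group by fa_fol_date&lt;br /&gt;
         , fa_fol_b_focal_animid&lt;br /&gt;
         , fa_b_arr_animid&lt;br /&gt;
         , fa_time_start&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fa_fol_date&lt;br /&gt;
         , fa_fol_b_focal_animid&lt;br /&gt;
         , fa_b_arr_animid&lt;br /&gt;
         , fa_time_start;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;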
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
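&lt;br /&gt;
A minimal sketch of such a cleanup, assuming the table in question is &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the query above does not name a schema) and that the fix is made in place:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal animal id,&lt;br /&gt;
-- touching only the rows that actually have them.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;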
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between the column names and the primary key designations.&lt;br /&gt;
ERROR MESSAGE (when the column name is spelled &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;): Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be either greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force, but adequate, solution: it does not validate anything about the affected rows.&lt;br /&gt;
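&lt;br /&gt;
As a sketch, assuming the change is made in place in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, the statement could look like:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace every 0 animal ID number with NULL.&lt;br /&gt;
-- No other column of the affected rows is examined.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = null&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;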
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=539</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=539"/>
		<updated>2026-02-16T16:56:45Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#58) OTHER_SPECIES duplicate keys */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
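&lt;br /&gt;
The page does not say how the conversion fixes these values. One possible approach, shown only as a sketch with made-up sample values, is to cast the column to TIME, which discards the date portion:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample values; the second row has a date that is not 1899-12-30.&lt;br /&gt;
with sample (&amp;quot;BREC_time&amp;quot;) as&lt;br /&gt;
  (values (timestamp &amp;#039;1899-12-30 07:15:00&amp;#039;)&lt;br /&gt;
        , (timestamp &amp;#039;1999-03-02 16:40:00&amp;#039;))&lt;br /&gt;
-- The cast keeps only the time of day, whatever the date part was.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;, &amp;quot;BREC_time&amp;quot;::TIME as time_only&lt;br /&gt;
  from sample;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;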
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024.&lt;br /&gt;
Changed the end date of the KK_P1 membership to 8/14/2022.&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
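&lt;br /&gt;
A minimal sketch of this substitution, assuming the unknown person&amp;#039;s code is &amp;#039;UNK&amp;#039; and using made-up sample rows in place of the real log table; the actual conversion code may differ:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample rows standing in for the update log.&lt;br /&gt;
with sample (chimp_id, made_by) as&lt;br /&gt;
  (values (&amp;#039;AAA&amp;#039;, &amp;#039;P1&amp;#039;)&lt;br /&gt;
        , (&amp;#039;BBB&amp;#039;, NULL))&lt;br /&gt;
-- A NULL made_by becomes the unknown person; other values are unchanged.&lt;br /&gt;
select chimp_id, coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from sample;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;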
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single person code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error. Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error. Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
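&lt;br /&gt;
The cleanup for problems #31 and #32 can be sketched with the same functions the queries on this page use. The sample values below are made up, and the actual conversion code may differ:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample animids: trailing spaces, lower-case, and already clean.&lt;br /&gt;
with sample (fol_b_animid) as&lt;br /&gt;
  (values (&amp;#039;AB  &amp;#039;), (&amp;#039;cd&amp;#039;), (&amp;#039;EF&amp;#039;))&lt;br /&gt;
-- rtrim() drops trailing spaces; upper() forces upper-case.&lt;br /&gt;
select fol_b_animid as original&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned&lt;br /&gt;
  from sample;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;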
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person both observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
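&lt;br /&gt;
A minimal sketch of a query that could list the affected follows, modeled on the query in problem #37.  It assumes that &amp;quot;only one observer&amp;quot; means exactly one of the two observer columns of a time period is NULL or empty (after space trimming); that reading is an assumption, not something stated above.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: follows where exactly one observer of the am or pm period is recorded.&lt;br /&gt;
-- Each parenthesized test is true when the column is NULL or blank;&lt;br /&gt;
-- comparing the two tests with &amp;lt;&amp;gt; is true when only one of the pair is blank.&lt;br /&gt;
select fol_b_animid, fol_date&lt;br /&gt;
     , fol_am_observer_1, fol_am_observer_2&lt;br /&gt;
     , fol_pm_observer_1, fol_pm_observer_2&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;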
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there is otherwise no value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  -- List every distinct observer value that contains a &amp;quot;/&amp;quot;.&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
    WHERE STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ from some other name only by character case.  Ignoring case, that is about half as many unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates during the conversion process is complicated, so resolving this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching row in the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with either the community entry date or the biography data.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
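&lt;br /&gt;
A sketch of how the change might be written, reusing the tables and columns from the query above.  It assumes the update is run directly against the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and that &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; is already an accepted cycle code there; the actual conversion scripts may do this differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: recode n/a to MISS for arrivals of female individuals.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;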
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;, or&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
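&lt;br /&gt;
A sketch of the case normalization described above.  It assumes the fix is applied to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and that &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; are the only misspelled variants.  Adding the new codes is not shown, because where the valid codes are stored is not described here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: normalize the capitalization of the known variants.&lt;br /&gt;
-- The WHERE clause limits the update to rows the CASE can match.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source =&lt;br /&gt;
        case&lt;br /&gt;
          when fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;) then &amp;#039;Brec&amp;#039;&lt;br /&gt;
          when fa_data_source = &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        end&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;, &amp;#039;TikI&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;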
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for two rows to be duplicates?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
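&lt;br /&gt;
No query is given above for the check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;.  A sketch of one, derived from the previous query by dropping that column, follows.  Whether it reproduces the 430 rows reported in the problem description has not been verified here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: checking against start time only&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join dups&lt;br /&gt;
         on (follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;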
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
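&lt;br /&gt;
A sketch of the cleanup, assuming the table in question is &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and that the column is a &amp;lt;code&amp;gt;text&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;varchar&amp;lt;/code&amp;gt; type, so that trailing spaces are significant in comparisons:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: strip trailing spaces, touching only the rows that have them.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;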
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG to look at the source of the problem. Maybe just delete the duplicates?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
Ian to check in Access.&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between the column names and the primary key designations.&lt;br /&gt;
An earlier version of this query spelled the column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; (lowercase &amp;quot;d&amp;quot;) and failed with:&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be either greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is an adequate but brute-force solution: it does not validate anything concerning the rows affected.&lt;br /&gt;
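&lt;br /&gt;
A minimal sketch of this change, assuming PostgreSQL and that &amp;lt;code&amp;gt;b_animid_num&amp;lt;/code&amp;gt; accepts &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (the conversion may implement it differently):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: replace the disallowed 0 with NULL.&lt;br /&gt;
-- No check is made that the affected rows really are babies.&lt;br /&gt;
UPDATE clean.biography&lt;br /&gt;
  SET b_animid_num = NULL&lt;br /&gt;
  WHERE b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;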
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focal pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
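&lt;br /&gt;
A minimal sketch of the second half of this solution, assuming PostgreSQL, that &amp;lt;code&amp;gt;fa_data_source&amp;lt;/code&amp;gt; is textual, and that the &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; code has already been added to ARRIVAL_SOURCES (that step is omitted because the column names of ARRIVAL_SOURCES are not given here):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: substitute the &amp;#039;none&amp;#039; code for a missing data source.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  WHERE fa_data_source IS NULL;&amp;lt;/pre&amp;gt;&lt;br /&gt;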
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=538</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=538"/>
		<updated>2026-02-16T16:55:10Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#57) GROOM_BOUT duplicate keys */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column contains non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
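&lt;br /&gt;
A minimal sketch of the substitution, assuming PostgreSQL, that &amp;lt;code&amp;gt;made_by&amp;lt;/code&amp;gt; is textual, and that &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; is the code of the unknown person. The output column name &amp;lt;code&amp;gt;made_by_or_unk&amp;lt;/code&amp;gt; is hypothetical, and the referenced commit may do this differently:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: show each log row with UNK standing in for a NULL made_by.&lt;br /&gt;
SELECT log.*&lt;br /&gt;
     , COALESCE(log.made_by, &amp;#039;UNK&amp;#039;) AS made_by_or_unk&lt;br /&gt;
  FROM clean.community_membership_update_log AS log;&amp;lt;/pre&amp;gt;&lt;br /&gt;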
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale.&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the latest arrival end&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
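A sketch of how the begin-in-nest value could be derived from FOLLOW_ARRIVAL alone, assuming (as the query above does) that nesting codes 1 and 3 mean the individual started in a nest.  The output column name &amp;lt;code&amp;gt;begin_in_nest&amp;lt;/code&amp;gt; is hypothetical, and the extra restriction to the focal&amp;#039;s own arrival row is an assumption, not necessarily what the conversion does:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: a NULL fa_type_of_nesting yields 0 here.&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.fa_fol_date, spans.fa_fol_b_focal_animid&lt;br /&gt;
     , CASE WHEN first_arrivals.fa_type_of_nesting IN (1, 3)&lt;br /&gt;
            THEN 1 ELSE 0 END AS begin_in_nest&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = spans.fa_fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
          AND first_arrivals.fa_b_arr_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;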
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
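&lt;br /&gt;
The cleanup for this problem and problem #31 can be sketched as a single expression.  The output column name &amp;lt;code&amp;gt;cleaned_animid&amp;lt;/code&amp;gt; is hypothetical, and the conversion program itself may do this differently:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: list the raw animids that the cleanup would change.&lt;br /&gt;
select fol_b_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;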
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
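The population rule described above could be sketched as follows.  The period codes &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; and the output column names are assumptions for illustration; this is not the conversion program&amp;#039;s actual code:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: one row per follow and time period,&lt;br /&gt;
-- skipping periods where both observers are NULL or empty.&lt;br /&gt;
select fol_date, fol_b_animid, &amp;#039;AM&amp;#039; as period&lt;br /&gt;
     , nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) as obs_brec&lt;br /&gt;
     , nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
union all&lt;br /&gt;
select fol_date, fol_b_animid, &amp;#039;PM&amp;#039; as period&lt;br /&gt;
     , nullif(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) as obs_brec&lt;br /&gt;
     , nullif(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;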
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
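&lt;br /&gt;
As a sketch, the substitution can be written with &amp;lt;code&amp;gt;coalesce&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;nullif&amp;lt;/code&amp;gt;.  Only the AM columns are shown, the person code &amp;#039;NONE&amp;#039; is taken from the solution above, and the output column names are assumptions:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: fill a missing AM observer with the &amp;#039;NONE&amp;#039; person,&lt;br /&gt;
-- for follows that have at least one AM observer.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;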
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates is complicated in the conversion process, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with the community entry date or the biography.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
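&lt;br /&gt;
A minimal sketch of that change, assuming the fix is applied directly to the &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; table and that &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; is already an accepted cycle code there:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: recode n/a to MISS for female arriving individuals&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;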
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
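&lt;br /&gt;
The two case corrections could be applied with something like the following. This is a sketch, assuming the corrections are made in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema; the additional codes listed above are left untouched, since they only need to be accepted as valid values.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the letter case of known data sources&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source&lt;br /&gt;
      = case&lt;br /&gt;
          when fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;) then &amp;#039;Brec&amp;#039;&lt;br /&gt;
          when fa_data_source = &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        end&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;, &amp;#039;TikI&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;where&amp;lt;/code&amp;gt; clause restricts the update to the three misspelled values, so the &amp;lt;code&amp;gt;case&amp;lt;/code&amp;gt; expression never falls through to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;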
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for two rows to be duplicates?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
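&lt;br /&gt;
No query is recorded here for the second check described in the problem statement, the one that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;. The following is a sketch adapted from the start-and-end-time query; it has not been checked against the data.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: checking against start time alone (fa_time_end left out)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
As in the query it is adapted from, rows with a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in any compared column are not matched by the equality tests.&lt;br /&gt;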
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
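&lt;br /&gt;
A minimal sketch of the cleanup, assuming it is run against &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and that the trimmed ids do not collide with existing key values:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal animal id&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;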
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between the column names and the primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (seemingly those for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force, though adequate, solution: it does not validate anything about the affected rows.&lt;br /&gt;
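&lt;br /&gt;
A minimal sketch of that change, assuming it is applied to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace zero animal id numbers with NULL&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;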
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focal pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
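&lt;br /&gt;
A minimal sketch of the cleanup, assuming it is run against &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and that the trimmed ids do not collide with existing key values:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the arriving animal id&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_b_arr_animid = rtrim(fa_b_arr_animid)&lt;br /&gt;
  where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;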
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
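&lt;br /&gt;
A sketch of the second half of that change, assuming the &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; row has already been added to ARRIVAL_SOURCES. That insert is not shown because the columns of ARRIVAL_SOURCES are not described on this page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace NULL data sources with the none code&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  where fa_data_source is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;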
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=537</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=537"/>
		<updated>2026-02-16T16:52:47Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#55) There are community_membership rows that place an individual in a community before birth */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When the query is run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
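&lt;br /&gt;
For illustration only: the fix above was made in MS Access, but the same normalization could be sketched in PostgreSQL with &amp;lt;code&amp;gt;NULLIF&amp;lt;/code&amp;gt;, which returns NULL when its two arguments are equal.  The sketch assumes the column is still textual, as it was when this problem was found.  The output column name &amp;lt;code&amp;gt;animid_num&amp;lt;/code&amp;gt; is a hypothetical label, not part of the schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: show empty strings as NULL; this does not modify any data.&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;&lt;br /&gt;
     , NULLIF(&amp;quot;B_AnimID_num&amp;quot;, &amp;#039;&amp;#039;) as animid_num&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;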
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
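&lt;br /&gt;
A minimal sketch of one way to keep only the time portion, assuming &amp;lt;code&amp;gt;BREC_time&amp;lt;/code&amp;gt; is stored as a timestamp and that discarding the date portion is acceptable.  The actual conversion code may do something different.  The output column name &amp;lt;code&amp;gt;time_only&amp;lt;/code&amp;gt; is hypothetical.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: cast the timestamp to TIME, dropping the unexpected date portion.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;::TIME as time_only&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;&lt;br /&gt;
  where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;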
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
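&lt;br /&gt;
The substitution described in this solution can be sketched with &amp;lt;code&amp;gt;COALESCE&amp;lt;/code&amp;gt;, assuming &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; is the code of the unknown person.  This is illustrative only and is not taken from the commit.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: report the unknown person wherever made_by is NULL.&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
     , COALESCE(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;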
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore. Default to data from Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These comprise &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
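&lt;br /&gt;
A minimal sketch of previewing this fix together with the trailing-space fix from problem #31, assuming both are applied when reading from the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.  The &amp;lt;code&amp;gt;cleaned_animid&amp;lt;/code&amp;gt; column name is hypothetical, and the actual conversion code may differ:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: list each raw animid that changes, next to its cleaned form.&lt;br /&gt;
-- cleaned_animid is a hypothetical output column name.&lt;br /&gt;
select distinct fol_b_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;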
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These comprise 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into FOLLOW_OBSERVERS.OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
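&lt;br /&gt;
No bad-data query is recorded for this problem.  A possible sketch, assuming that &amp;quot;only one observer&amp;quot; means exactly one of the two observer columns for a time period is blank (NULL or empty after trimming), using the same blank test as the query in problem #37:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: follows where the AM pair or the PM pair has&lt;br /&gt;
-- exactly one blank observer.  Comparing the two &amp;quot;is blank&amp;quot;&lt;br /&gt;
-- booleans with &amp;lt;&amp;gt; is true when exactly one of them is blank.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , fol_am_observer_1, fol_am_observer_2&lt;br /&gt;
     , fol_pm_observer_1, fol_pm_observer_2&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
     or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;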
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
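&lt;br /&gt;
The query above only reports names containing &amp;quot;/&amp;quot; that also have a variant differing in character case.  If the intent is to list every observer value containing &amp;quot;/&amp;quot;, a simpler sketch could be used; that intent is an assumption, not something recorded here:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: every distinct observer value containing &amp;quot;/&amp;quot;,&lt;br /&gt;
-- with the number of observer-column entries using it.&lt;br /&gt;
-- NULL observers drop out because STRPOS(NULL, ...) is NULL.&lt;br /&gt;
SELECT observers.person, COUNT(*)&lt;br /&gt;
  FROM (SELECT BTRIM(fol_am_observer_1) AS person FROM clean.follow&lt;br /&gt;
        UNION ALL&lt;br /&gt;
        SELECT BTRIM(fol_am_observer_2) FROM clean.follow&lt;br /&gt;
        UNION ALL&lt;br /&gt;
        SELECT BTRIM(fol_pm_observer_1) FROM clean.follow&lt;br /&gt;
        UNION ALL&lt;br /&gt;
        SELECT BTRIM(fol_pm_observer_2) FROM clean.follow) AS observers&lt;br /&gt;
  WHERE STRPOS(observers.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
  GROUP BY observers.person&lt;br /&gt;
  ORDER BY observers.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;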
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  This amounts to about half that many unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolution of this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with the community entry date or the biography.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
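&lt;br /&gt;
To gauge how many rows the second change would touch, a sketch along these lines may help.  It assumes the cycle code is stored as the text value &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, which the text comparisons against &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt; above suggest but do not confirm:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: count, per sex, the non-female arrivals&lt;br /&gt;
-- whose cycle code is 0 and would become n/a.&lt;br /&gt;
select biography.b_sex, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;&lt;br /&gt;
  group by biography.b_sex&lt;br /&gt;
  order by biography.b_sex;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;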
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
	and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
		)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
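&lt;br /&gt;
The recoding rule can be sketched as a &amp;lt;code&amp;gt;CASE&amp;lt;/code&amp;gt; expression.  This is an illustration only; the &amp;lt;code&amp;gt;new_cycle_code&amp;lt;/code&amp;gt; column name is hypothetical and the actual cleanup code may differ:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show the cycle code each arrival would end up with.&lt;br /&gt;
-- The left join keeps arrivals whose animid has no biography row;&lt;br /&gt;
-- those rows keep their existing code.&lt;br /&gt;
select follow_arrival.fa_fol_date&lt;br /&gt;
     , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
     , follow_arrival.fa_b_arr_animid&lt;br /&gt;
     , follow_arrival.fa_type_of_cycle&lt;br /&gt;
     , case&lt;br /&gt;
         when biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
              and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
           then &amp;#039;MISS&amp;#039;&lt;br /&gt;
         else follow_arrival.fa_type_of_cycle&lt;br /&gt;
       end as new_cycle_code&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    left join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;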
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
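&lt;br /&gt;
The two spelling corrections can be sketched as a &amp;lt;code&amp;gt;CASE&amp;lt;/code&amp;gt; expression over the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.  This is an illustration under the assumption that only the three spellings named above need mapping; the &amp;lt;code&amp;gt;cleaned_data_source&amp;lt;/code&amp;gt; column name is hypothetical:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: each raw fa_data_source value, the value it&lt;br /&gt;
-- would be cleaned to, and the number of rows affected.&lt;br /&gt;
select follow_arrival.fa_data_source&lt;br /&gt;
     , case&lt;br /&gt;
         when follow_arrival.fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;)&lt;br /&gt;
           then &amp;#039;Brec&amp;#039;&lt;br /&gt;
         when follow_arrival.fa_data_source = &amp;#039;TikI&amp;#039;&lt;br /&gt;
           then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
         else follow_arrival.fa_data_source&lt;br /&gt;
       end as cleaned_data_source&lt;br /&gt;
     , count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;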
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
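&lt;br /&gt;
The problem description also mentions a check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;, but no query is listed for it. A minimal sketch, assuming that check follows the same pattern as the start and end time query, might be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: checking against start time only (fa_time_end left off)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;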
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
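&lt;br /&gt;
A minimal sketch of such a fix, assuming the affected table is &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and that no constraint blocks changing the column value:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal animid.&lt;br /&gt;
-- The schema name is an assumption; the query above does not give one.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;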
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between column names and primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything about the affected rows, but it is adequate.&lt;br /&gt;
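&lt;br /&gt;
A minimal sketch of the change, assuming it is applied directly to &amp;lt;code&amp;gt;clean.biography&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace every 0 animal ID number with NULL.&lt;br /&gt;
-- Nothing else about the affected rows is checked.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;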
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (It is not in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
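&lt;br /&gt;
A minimal sketch of such a fix, assuming the affected table is &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the query above reads from the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema) and that no constraint blocks changing the column value:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the arriving animid.&lt;br /&gt;
-- Applying it to the clean schema is an assumption taken from the solution text.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_b_arr_animid = rtrim(fa_b_arr_animid)&lt;br /&gt;
  where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;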
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
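&lt;br /&gt;
A minimal sketch of the second half of this change, assuming the &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; row has already been added to ARRIVAL_SOURCES (that table&amp;#039;s column names are not given here, so the insert is not shown):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: use the none value in place of NULL.&lt;br /&gt;
-- Assumes the none row already exists in ARRIVAL_SOURCES.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  where fa_data_source is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;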
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=536</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=536"/>
		<updated>2026-02-16T16:52:05Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not 0, U, or MISS */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which help eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024.&lt;br /&gt;
Changed the end date of the KK_P1 membership to 8/14/2022.&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
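&lt;br /&gt;
A minimal sketch of the substitution follows.  The inline rows are made-up sample data standing in for the update log, and the person code &amp;lt;code&amp;gt;P1&amp;lt;/code&amp;gt; is hypothetical; only the &amp;lt;code&amp;gt;chimp_id&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;made_by&amp;lt;/code&amp;gt; column names and the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; code come from this page.  The actual conversion code may do this differently.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample rows; not real COMM_MEMB_LOG data.&lt;br /&gt;
with sample_log (chimp_id, made_by) as (&lt;br /&gt;
  values (&amp;#039;AAA&amp;#039;, &amp;#039;P1&amp;#039;)&lt;br /&gt;
       , (&amp;#039;BBB&amp;#039;, null))&lt;br /&gt;
-- Replace a NULL made_by with the unknown person.&lt;br /&gt;
select chimp_id&lt;br /&gt;
     , coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from sample_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;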
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which value is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which value is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
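&lt;br /&gt;
The cleanup for this problem and for problem #31 can be sketched together, using the same &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; expression that appears in the query for problem #33.  The sample values below are made up and stand in for &amp;lt;code&amp;gt;easy.follow.fol_b_animid&amp;lt;/code&amp;gt;; the actual conversion code may differ.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical raw values; not real animids.&lt;br /&gt;
with raw_follow (fol_b_animid) as (&lt;br /&gt;
  values (&amp;#039;ABC  &amp;#039;)&lt;br /&gt;
       , (&amp;#039;abc&amp;#039;)&lt;br /&gt;
       , (&amp;#039;ABC&amp;#039;))&lt;br /&gt;
-- Strip trailing spaces, then force upper-case.&lt;br /&gt;
select fol_b_animid as raw_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as clean_animid&lt;br /&gt;
  from raw_follow;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;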
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
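&lt;br /&gt;
The mapping described above can be sketched as follows.  This is an illustration, not the conversion code: the sample follows and observer codes are made up, and the period codes &amp;lt;code&amp;gt;AM&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;PM&amp;lt;/code&amp;gt; are assumptions about the FOLLOW_OBSERVERS.Period values.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample follows; period codes AM and PM are assumed.&lt;br /&gt;
with sample_follow (fol_date, fol_b_animid&lt;br /&gt;
                  , fol_am_observer_1, fol_am_observer_2&lt;br /&gt;
                  , fol_pm_observer_1, fol_pm_observer_2) as (&lt;br /&gt;
  values (date &amp;#039;2000-01-01&amp;#039;, &amp;#039;ABC&amp;#039;, &amp;#039;P1&amp;#039;, &amp;#039;P2&amp;#039;, null, &amp;#039;  &amp;#039;)&lt;br /&gt;
       , (date &amp;#039;2000-01-02&amp;#039;, &amp;#039;ABC&amp;#039;, &amp;#039;P1&amp;#039;, null, &amp;#039;P3&amp;#039;, &amp;#039;P4&amp;#039;))&lt;br /&gt;
, periods as (&lt;br /&gt;
  -- One candidate row per follow per time period.&lt;br /&gt;
  select fol_date, fol_b_animid, &amp;#039;AM&amp;#039; as period&lt;br /&gt;
       , btrim(fol_am_observer_1) as obs_brec&lt;br /&gt;
       , btrim(fol_am_observer_2) as obs_tiki&lt;br /&gt;
    from sample_follow&lt;br /&gt;
  union all&lt;br /&gt;
  select fol_date, fol_b_animid, &amp;#039;PM&amp;#039;&lt;br /&gt;
       , btrim(fol_pm_observer_1)&lt;br /&gt;
       , btrim(fol_pm_observer_2)&lt;br /&gt;
    from sample_follow)&lt;br /&gt;
select *&lt;br /&gt;
  from periods&lt;br /&gt;
  -- Skip a period when both observers are NULL or empty.&lt;br /&gt;
  where coalesce(obs_brec, &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(obs_tiki, &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
In this sample the first follow produces no PM row, because both of its PM observer columns are NULL or blank.&lt;br /&gt;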
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
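&lt;br /&gt;
A minimal sketch of the substitution, assuming the observer columns are text and that blank values should be treated like NULL.  The sample rows and the observer codes &amp;lt;code&amp;gt;P1&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;P2&amp;lt;/code&amp;gt; are made up.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample rows with one observer missing in each.&lt;br /&gt;
with sample_obs (obs_brec, obs_tiki) as (&lt;br /&gt;
  values (&amp;#039;P1&amp;#039;, null)&lt;br /&gt;
       , (&amp;#039;  &amp;#039;, &amp;#039;P2&amp;#039;))&lt;br /&gt;
-- Turn blank into NULL, then NULL into the NONE person.&lt;br /&gt;
select coalesce(nullif(btrim(obs_brec), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(obs_tiki), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_tiki&lt;br /&gt;
  from sample_obs;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;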
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
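&lt;br /&gt;
One way to mark these people is sketched below.  The table name &amp;lt;code&amp;gt;people_sketch&amp;lt;/code&amp;gt; and the &amp;lt;code&amp;gt;active&amp;lt;/code&amp;gt; column are hypothetical, as are the sample codes; the real PEOPLE design may name and store this differently.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical table and column names, for illustration only.&lt;br /&gt;
create temporary table people_sketch (&lt;br /&gt;
  person text primary key,&lt;br /&gt;
  active boolean not null default true);&lt;br /&gt;
&lt;br /&gt;
insert into people_sketch (person)&lt;br /&gt;
  values (&amp;#039;P1&amp;#039;), (&amp;#039;P1/P2&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
-- Deactivate every code that contains a slash.&lt;br /&gt;
update people_sketch&lt;br /&gt;
  set active = false&lt;br /&gt;
  where strpos(person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;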
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these variants, and excluding the duplicates they produce, complicates the conversion process, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
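&lt;br /&gt;
A minimal sketch of that change, assuming the &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; code has already been added to &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; and that the update is applied in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (neither assumption is confirmed on this page):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: run in the clean schema, after MISS exists as a cycle code.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;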
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with the community entry date or the biography.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
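&lt;br /&gt;
A sketch of how that change could be written, assuming the &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; code already exists and reusing the join from the query above.  The actual statement used in the conversion is not shown on this page, so treat this as an illustration only:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: the same sex test as the query above identifies the rows.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;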
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
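&lt;br /&gt;
The two case corrections above could be sketched as follows, assuming they are applied to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  How the conversion actually applies them is not recorded here, and adding the new codes to the list of valid values is a separate step that is not shown:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: applied in the clean schema; easy keeps the original values.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;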
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for two rows to be duplicates?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
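&lt;br /&gt;
The check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;, described in the problem statement, has no query listed.  The following is a sketch by analogy with the query above, assuming the original check was built the same way:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (a sketch, not the original query)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;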
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
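&lt;br /&gt;
A minimal sketch of the removal, assuming the column is a text or varchar type in which trailing spaces are significant (as the query above implies):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: applied in the clean schema only.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;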
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
	       , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The MS Access database likely allows this condition because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be either greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything about the affected rows, but it is adequate.&lt;br /&gt;
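&lt;br /&gt;
As a sketch, assuming the change is made in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, the replacement could be written as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: applied in the clean schema; no other columns are checked.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = null&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;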
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
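&lt;br /&gt;
A minimal sketch of the cleanup, assuming &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; holds the same rows as &amp;lt;code&amp;gt;easy.follow_arrival&amp;lt;/code&amp;gt; and that trimming does not create duplicate keys:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: strip trailing spaces from the arriving animid.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_b_arr_animid = rtrim(fa_b_arr_animid)&lt;br /&gt;
  WHERE fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&amp;lt;/pre&amp;gt;&lt;br /&gt;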
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
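&lt;br /&gt;
A minimal sketch of the second half of that change, assuming the &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; row has already been added to ARRIVAL_SOURCES (its column names are not given here, so that step is omitted):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: substitute the &amp;#039;none&amp;#039; code for NULL data sources.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  WHERE fa_data_source IS NULL;&amp;lt;/pre&amp;gt;&lt;br /&gt;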
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=535</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=535"/>
		<updated>2026-02-16T16:47:37Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* Solution */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
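&lt;br /&gt;
The conversion code itself is not shown here.  As a sketch of one way such a fix could be expressed, assuming the stray date portion is simply discarded and only the time of day is kept (the &amp;lt;code&amp;gt;brec_time&amp;lt;/code&amp;gt; alias is illustrative):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: keep the time of day, dropping whatever date was stored.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;::TIME as brec_time&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;;&amp;lt;/pre&amp;gt;&lt;br /&gt;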
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
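&lt;br /&gt;
A minimal sketch of the substitution, assuming an &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; row already exists on the people table (the actual change was made in the conversion program and may differ):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: use the unknown person where no person was recorded.&lt;br /&gt;
UPDATE clean.community_membership_update_log&lt;br /&gt;
  SET made_by = &amp;#039;UNK&amp;#039;&lt;br /&gt;
  WHERE made_by IS NULL;&amp;lt;/pre&amp;gt;&lt;br /&gt;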
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single person code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error; ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s columns &lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from FOLLOW_ARRIVAL.&lt;br /&gt;
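&lt;br /&gt;
As a rough sketch of what defaulting to FOLLOW_ARRIVAL could look like, the query below derives a begin-in-nest flag from each focal&amp;#039;s first arrival, using the same nesting codes (1 and 3) that the query above treats as starting in the nest.  The &amp;lt;code&amp;gt;derived_begin_in_nest&amp;lt;/code&amp;gt; name is made up for illustration, and treating a NULL nesting code as not in the nest is an assumption.  This is not necessarily how the conversion does it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: begin-in-nest flag taken from the first arrival of the focal.&lt;br /&gt;
-- A NULL fa_type_of_nesting falls through to 0 (an assumption).&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.fa_fol_date, spans.fa_fol_b_focal_animid&lt;br /&gt;
     , CASE WHEN first_arrivals.fa_type_of_nesting IN (1, 3)&lt;br /&gt;
            THEN 1&lt;br /&gt;
            ELSE 0&lt;br /&gt;
       END AS derived_begin_in_nest&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = spans.fa_fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
          AND first_arrivals.fa_b_arr_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;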
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
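&lt;br /&gt;
To preview the effect of this cleanup together with the trailing-space cleanup of problem #31, a query along the following lines shows each affected animid beside its cleaned form.  It reuses the &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; expression found elsewhere on this page.  The &amp;lt;code&amp;gt;cleaned_animid&amp;lt;/code&amp;gt; alias is made up for illustration, and the conversion itself may do the cleanup differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: original animid next to its trimmed, upper-cased form.&lt;br /&gt;
select fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;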
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that were done before their focal was under study, that is, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the &amp;quot;NONE&amp;quot; (no observer) person for the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
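&lt;br /&gt;
To see which observer values this rule would mark as not active, a simpler query along these lines lists every observer value containing a &amp;quot;/&amp;quot;, with a count of how often it is used.  This is only a sketch.  The &amp;lt;code&amp;gt;observers&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;uses&amp;lt;/code&amp;gt; names are made up for illustration.  Note that &amp;quot;n/a&amp;quot; also matches.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: every observer value containing a slash, with a use count.&lt;br /&gt;
-- NULL observers drop out because strpos() of NULL is NULL.&lt;br /&gt;
select person, count(*) as uses&lt;br /&gt;
  from (select btrim(fol_am_observer_1) as person from clean.follow&lt;br /&gt;
        union all&lt;br /&gt;
        select btrim(fol_am_observer_2) from clean.follow&lt;br /&gt;
        union all&lt;br /&gt;
        select btrim(fol_pm_observer_1) from clean.follow&lt;br /&gt;
        union all&lt;br /&gt;
        select btrim(fol_pm_observer_2) from clean.follow) as observers&lt;br /&gt;
  where strpos(person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
  group by person&lt;br /&gt;
  order by person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;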
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
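&lt;br /&gt;
A minimal sketch of the substitution, assuming it is applied when the rows are read, is shown below.  The &amp;lt;code&amp;gt;converted_cycle&amp;lt;/code&amp;gt; alias is made up for illustration.  How the conversion actually applies the &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; value is not shown on this page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: the affected arrivals with MISS substituted for the NULL cycle.&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as converted_cycle&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;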
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with the community entry date or the biography.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOW UP WITH KARL.&lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
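&lt;br /&gt;
To gauge how many of the rows above the cycle-code rule would fix, and how many would remain to be fixed by hand, something like the following could be used.  It is a sketch which assumes the code is stored as the text &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;.  The &amp;lt;code&amp;gt;fixed_by_rule&amp;lt;/code&amp;gt; alias is made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: split the problem rows by whether the cycle code is 0.&lt;br /&gt;
-- Assumes the code 0 is stored as the text value shown.&lt;br /&gt;
select biography.b_sex&lt;br /&gt;
     , (follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;) as fixed_by_rule&lt;br /&gt;
     , count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex&lt;br /&gt;
         , (follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;)&lt;br /&gt;
  order by biography.b_sex, fixed_by_rule;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;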
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
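&lt;br /&gt;
A minimal sketch of this change, assuming it is applied as a direct &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; (the conversion scripts may do it differently).  It reuses the join and conditions of the query above:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: recode n/a to MISS for female arriving individuals.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;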
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
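&lt;br /&gt;
The case corrections could be sketched as follows, assuming they are applied as direct updates to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  Adding the new codes is not shown, because the table that holds the valid codes is not described here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the letter case of the known codes.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;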
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for two rows to be duplicates?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
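&lt;br /&gt;
A minimal sketch of the cleanup, assuming a direct &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; on the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; copy of the table.  The &amp;lt;code&amp;gt;where&amp;lt;/code&amp;gt; clause is the test used above, so only the affected rows are rewritten:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal id.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;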
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
	       , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The MS Access database likely allows this condition because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution: it is adequate, but it does not validate anything concerning the rows affected.&lt;br /&gt;
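&lt;br /&gt;
As a sketch, assuming the change is made directly in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace every 0 with NULL, with no other checks on the rows.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = null&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;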
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
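&lt;br /&gt;
The second half of this could be sketched as below, assuming a direct &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt;.  Creating the &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; row in ARRIVAL_SOURCES must come first; it is not shown because the columns of that table are not listed here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace NULL data sources with the none code.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  where fa_data_source is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;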
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=534</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=534"/>
		<updated>2026-02-16T16:44:38Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which help eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When the query is run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
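&lt;br /&gt;
A minimal sketch of such a view, for illustration only.  It assumes that DadIDPrelim is a boolean, that BIOGRAPHY_DATA has an AnimID column, and that unquoted lower-case column names are used; none of this is confirmed by the actual design.&lt;br /&gt;
&lt;br /&gt;
&lt;pre&gt;&lt;br /&gt;
-- Hypothetical: the column names and the boolean type of dadidprelim are assumptions.&lt;br /&gt;
create view biography as&lt;br /&gt;
  select animid&lt;br /&gt;
       , case when dadidprelim&lt;br /&gt;
              then dad_id || &#039;_prelim&#039;&lt;br /&gt;
              else dad_id&lt;br /&gt;
         end as dadid&lt;br /&gt;
    from biography_data;&lt;br /&gt;
&lt;/pre&gt;&lt;br /&gt;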
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
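&lt;br /&gt;
The source does not say how the conversion fixes these rows.  One possibility, shown here purely as an assumption, is to keep only the time-of-day part of the value.  The time_only column name is illustrative.&lt;br /&gt;
&lt;br /&gt;
&lt;pre&gt;&lt;br /&gt;
-- Assumption: discarding the date part is an acceptable fix.&lt;br /&gt;
select &quot;BREC_time&quot;&lt;br /&gt;
     , &quot;BREC_time&quot;::TIME WITHOUT TIME ZONE as time_only&lt;br /&gt;
  from raw.&quot;BRECORD_NOTES&quot;&lt;br /&gt;
  where &quot;BREC_time&quot;::DATE &lt;&gt; &#039;1899-12-30&#039;;&lt;br /&gt;
&lt;/pre&gt;&lt;br /&gt;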
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &lt;code&gt;easy&lt;/code&gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
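&lt;br /&gt;
A minimal sketch of the substitution, assuming the unknown person&#039;s code is the literal &#039;UNK&#039; and that it is applied with &lt;code&gt;coalesce()&lt;/code&gt;.  The made_by_converted column name is illustrative, and the actual conversion code may differ.&lt;br /&gt;
&lt;br /&gt;
&lt;pre&gt;&lt;br /&gt;
-- Hypothetical: show each log row with UNK substituted for a NULL made_by.&lt;br /&gt;
select log.*&lt;br /&gt;
     , coalesce(log.made_by, &#039;UNK&#039;) as made_by_converted&lt;br /&gt;
  from clean.community_membership_update_log as log;&lt;br /&gt;
&lt;/pre&gt;&lt;br /&gt;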
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&#039;s column&lt;br /&gt;
&quot;fol_time_begin&quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&#039;s column&lt;br /&gt;
&quot;fol_time_end&quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
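&lt;br /&gt;
As a sketch of what defaulting to Follow_Arrival could look like, the following query derives a begin-in-nest flag from the focal&#039;s first arrival, treating nesting types 1 and 3 as starting in the nest, as the Bad Data query does.  The begins_in_nest column name, and the restriction of the first arrival to the focal&#039;s own arrival rows, are assumptions and not part of the conversion.&lt;br /&gt;
&lt;br /&gt;
&lt;pre&gt;&lt;br /&gt;
-- Hypothetical: derive the flag from follow_arrival alone.&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.fa_fol_date, spans.fa_fol_b_focal_animid&lt;br /&gt;
     , (first_arrivals.fa_type_of_nesting IN (1, 3)) AS begins_in_nest&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN easy.follow_arrival AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = spans.fa_fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
          -- Assumption: only the focal&#039;s own arrival rows are considered.&lt;br /&gt;
          AND first_arrivals.fa_b_arr_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;/pre&gt;&lt;br /&gt;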
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
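&lt;br /&gt;
The cleanup described in problems #31 and #32 can be previewed together.  A sketch, using the same &lt;code&gt;upper(rtrim())&lt;/code&gt; expression that the problem #33 queries use; the output column names are illustrative.&lt;br /&gt;
&lt;br /&gt;
&lt;pre&gt;&lt;br /&gt;
-- List every follow whose animid would change during cleanup.&lt;br /&gt;
select fol_date&lt;br /&gt;
     , fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &lt;&gt; fol_b_animid&lt;br /&gt;
  order by fol_date, cleaned_animid;&lt;br /&gt;
&lt;/pre&gt;&lt;br /&gt;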
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into FOLLOW_OBSERVERS.OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
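&lt;br /&gt;
As a rough sketch of the mapping described above (this is not the actual conversion code), the rows could be produced as follows.  The &amp;lt;code&amp;gt;AM&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;PM&amp;lt;/code&amp;gt; period codes and the output column names are assumptions made for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: one row per follow and time period having an observer.&lt;br /&gt;
-- The &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; period codes are hypothetical.&lt;br /&gt;
-- Does not substitute a placeholder for a missing observer (see #38).&lt;br /&gt;
select follow.fol_b_animid&lt;br /&gt;
     , follow.fol_date&lt;br /&gt;
     , &amp;#039;AM&amp;#039; as period&lt;br /&gt;
     , btrim(follow.fol_am_observer_1) as obs_brec&lt;br /&gt;
     , btrim(follow.fol_am_observer_2) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(follow.fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(follow.fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
union all&lt;br /&gt;
select follow.fol_b_animid&lt;br /&gt;
     , follow.fol_date&lt;br /&gt;
     , &amp;#039;PM&amp;#039; as period&lt;br /&gt;
     , btrim(follow.fol_pm_observer_1) as obs_brec&lt;br /&gt;
     , btrim(follow.fol_pm_observer_2) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(follow.fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(follow.fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;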
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
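&lt;br /&gt;
No query is recorded for this problem.  The following is a sketch of one that might list the affected follows, assuming &amp;quot;only one observer&amp;quot; means that exactly one of the two observer columns for a time period is non-empty after space trimming.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: follows where, in the AM or the PM period,&lt;br /&gt;
-- exactly one of the two observer columns is empty.&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;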
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these variants and excluding the duplicates in the conversion process is complicated, so resolving this issue is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
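&lt;br /&gt;
To illustrate the cutoff used in the query above, here is the same date arithmetic applied to a made-up follow date.  The date is an assumption chosen only for the example.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only, using a hypothetical follow date of 2000-06-15.&lt;br /&gt;
-- The result is 1985-06-15 00:00:00, so only females born on or before&lt;br /&gt;
-- 1985-06-15 (at least 15 years old on the follow date) are reported.&lt;br /&gt;
select &amp;#039;2000-06-15&amp;#039;::DATE&lt;br /&gt;
       - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
       - &amp;#039;14 years&amp;#039;::INTERVAL AS latest_reported_birthdate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;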
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
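&lt;br /&gt;
The statement actually used is not recorded here.  As a sketch, assuming the change is applied as a single update to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, it might look like:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: recode n/a to MISS for female arriving individuals.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;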
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for two rows to be duplicates?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
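&lt;br /&gt;
The query behind the start-time-only figure given in the problem description is not recorded here.  The following is a sketch that assumes it mirrors the query above, with &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; left out.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: checking against start time alone&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;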
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
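&lt;br /&gt;
A minimal sketch of such a cleanup, assuming it is run directly against the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and that no key or foreign key constraint blocks the change (the actual conversion may do this differently):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal animid.&lt;br /&gt;
-- The WHERE clause limits the update to the rows listed by the query above.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;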
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between the column names and the primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything concerning the rows affected, but it is adequate.&lt;br /&gt;
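&lt;br /&gt;
A minimal sketch of the change, assuming it is applied directly to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (the actual conversion may make the change elsewhere):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace the 0 placeholder with NULL.&lt;br /&gt;
-- No check is made that the affected rows really are babies.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;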
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
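&lt;br /&gt;
A minimal sketch of such a cleanup, assuming it is run directly against the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and that no key or foreign key constraint blocks the change (the actual conversion may do this differently):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the arriving animid.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_b_arr_animid = rtrim(fa_b_arr_animid)&lt;br /&gt;
  where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;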
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=533</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=533"/>
		<updated>2026-02-16T16:41:21Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#46) There are follow_arrivals where non-females have a cycle code that is other than n/a */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 7a48fb2c4aad3b5b.&lt;br /&gt;
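&lt;br /&gt;
A minimal sketch of the substitution, not the conversion program&amp;#039;s actual code.  It assumes the unknown person&amp;#039;s code is the literal &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; and that the replacement is made while reading from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: &amp;#039;UNK&amp;#039; is the code of the unknown person.&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
     , coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log&lt;br /&gt;
  order by date_of_update, chimp_id;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;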
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
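&lt;br /&gt;
A sketch of the cleanup from problems #31 and #32 combined, assuming it is applied to the animid as it is read from the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.  It lists each follow whose animid would change, alongside the cleaned value; the output column names are illustrative only:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- original_animid and cleaned_animid are illustrative names.&lt;br /&gt;
select fol_date&lt;br /&gt;
     , fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;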
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
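&lt;br /&gt;
The mapping described above can be sketched as a query.  This is an illustration, not the conversion program&amp;#039;s code: the period codes &amp;lt;code&amp;gt;AM&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;PM&amp;lt;/code&amp;gt; and the output column names are assumptions.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; stand in for the real PERIODS codes.&lt;br /&gt;
-- One row per follow and time period, skipping periods with no observers.&lt;br /&gt;
select fol_date, fol_b_animid, &amp;#039;AM&amp;#039; as period&lt;br /&gt;
     , btrim(fol_am_observer_1) as obs_brec&lt;br /&gt;
     , btrim(fol_am_observer_2) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
union all&lt;br /&gt;
select fol_date, fol_b_animid, &amp;#039;PM&amp;#039; as period&lt;br /&gt;
     , btrim(fol_pm_observer_1) as obs_brec&lt;br /&gt;
     , btrim(fol_pm_observer_2) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;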
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
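&lt;br /&gt;
No query is recorded for this problem.  A possible one is sketched here, assuming &amp;quot;only one observer&amp;quot; means that, within the AM or the PM period, exactly one of the two observer columns has a value after space trimming:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Comparing the two booleans with &amp;lt;&amp;gt; is true when exactly one&lt;br /&gt;
-- of the pair of observer columns is blank.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , fol_am_observer_1, fol_am_observer_2&lt;br /&gt;
     , fol_pm_observer_1, fol_pm_observer_2&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
     or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;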
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for this and excluding the duplicates in the conversion process is complicated, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit and set a soft limit of 5 years in the warning system, with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit and set a soft limit of 14 years in the warning system, with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
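&lt;br /&gt;
A minimal sketch of such an update, assuming it is run directly against the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and that &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; is already an accepted cycle code there. It reuses only the table and column names from the query above and is illustrative, not the conversion code actually used:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: recode n/a to MISS for arriving females.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  where follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
        and exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
                   and biography.b_sex = &amp;#039;F&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;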
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
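&lt;br /&gt;
The two letter-case corrections could be sketched as follows, assuming they are applied to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (the query above reads &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; because &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; already holds corrected data). The statements are illustrative, not the conversion code actually used:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the letter case of the known codes.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;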
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
The combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt; is always unique.&lt;br /&gt;
But checking for duplicates on the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and checking only the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for two rows to be duplicates?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
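&lt;br /&gt;
The query for the start-time-only check is not recorded in this entry. A sketch of it, assuming it simply mirrors the query above with &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; left out:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (sketch)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that the &amp;lt;code&amp;gt;=&amp;lt;/code&amp;gt; comparisons never match &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, so groups whose time value is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; are counted in &amp;lt;code&amp;gt;dups&amp;lt;/code&amp;gt; but their rows are not listed by the outer select.&lt;br /&gt;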
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
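&lt;br /&gt;
A sketch of the cleanup, assuming the column is a &amp;lt;code&amp;gt;text&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;varchar&amp;lt;/code&amp;gt; type and that the table to change is &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: touch just the rows that actually have trailing spaces.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;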
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The MS Access database seems to allow this condition, likely because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be either greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force, but adequate, solution: it does not validate anything else about the affected rows.&lt;br /&gt;
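&lt;br /&gt;
As a sketch, assuming the change is made in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, the statement mirrors the query above:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: replace the placeholder 0 with NULL.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;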
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person; it does not appear in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
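&lt;br /&gt;
A sketch of the cleanup for the arriving animid, assuming a &amp;lt;code&amp;gt;text&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;varchar&amp;lt;/code&amp;gt; column and that the table to change is &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: trim the arriving animid where it has trailing spaces.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_b_arr_animid = rtrim(fa_b_arr_animid)&lt;br /&gt;
  where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;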
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
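&lt;br /&gt;
The second half of this could be sketched as below. Adding the &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; row to ARRIVAL_SOURCES has to happen first and is not shown, because that table&amp;#039;s column names are not given on this page:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: assumes the none code already exists in ARRIVAL_SOURCES.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  WHERE fa_data_source IS NULL;&amp;lt;/pre&amp;gt;&lt;br /&gt;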
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=532</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=532"/>
		<updated>2026-02-16T16:36:10Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#45) The follow_arrival.fa_update column is not preserved */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When the query is run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
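&lt;br /&gt;
This page does not record how the conversion fixes these rows.  As an illustration only, and assuming the time-of-day portion of BREC_time is correct and only the date portion is wrong, a cast that drops the date would be enough:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only; not the actual conversion code.&lt;br /&gt;
-- Assumes &amp;quot;BREC_time&amp;quot; is a timestamp whose time of day is correct.&lt;br /&gt;
-- The brec_time alias is illustrative.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;::TIME as brec_time  -- discards the stray date portion&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;;&amp;lt;/pre&amp;gt;&lt;br /&gt;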
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
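&lt;br /&gt;
Once the data is clean, a rule like this can in principle be enforced by the database itself.  The following is a sketch only, not part of the conversion.  It assumes that &amp;lt;code&amp;gt;clean.community_membership&amp;lt;/code&amp;gt; is an ordinary table, that the start and end columns have the DATE type, and that the &amp;lt;code&amp;gt;btree_gist&amp;lt;/code&amp;gt; extension may be installed.  The constraint name is hypothetical.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Needed so a GiST index can compare cm_b_animid with =&lt;br /&gt;
CREATE EXTENSION IF NOT EXISTS btree_gist;&lt;br /&gt;
&lt;br /&gt;
-- Reject any two rows for the same individual whose date ranges share a day.&lt;br /&gt;
-- The &amp;#039;[]&amp;#039; bounds make both the start and the end date inclusive,&lt;br /&gt;
-- matching the &amp;gt;= comparison in the query above.&lt;br /&gt;
ALTER TABLE clean.community_membership&lt;br /&gt;
  ADD CONSTRAINT no_overlapping_memberships&lt;br /&gt;
  EXCLUDE USING gist&lt;br /&gt;
    (cm_b_animid WITH =&lt;br /&gt;
   , (daterange(cm_start_date, cm_end_date, &amp;#039;[]&amp;#039;)) WITH &amp;amp;&amp;amp;);&amp;lt;/pre&amp;gt;&lt;br /&gt;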
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
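&lt;br /&gt;
The substitution might look like the following sketch.  It uses the &amp;lt;code&amp;gt;made_by&amp;lt;/code&amp;gt; column from the query above and the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; code from the solution text; the code in the commit may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: report made_by with NULL replaced by the unknown person.&lt;br /&gt;
select coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log;&amp;lt;/pre&amp;gt;&lt;br /&gt;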
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
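&lt;br /&gt;
Under this solution the start of a follow can be derived from the focal&amp;#039;s own arrivals instead of being read from fol_time_begin.  A sketch, reusing the &amp;lt;code&amp;gt;spans&amp;lt;/code&amp;gt; subquery from the Bad Data query above (the &amp;lt;code&amp;gt;follow_start&amp;lt;/code&amp;gt; alias is illustrative):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: one row per follow, with the focal&amp;#039;s earliest arrival time.&lt;br /&gt;
SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
     , MIN(fa_time_start) AS follow_start&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
  GROUP BY fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
  ORDER BY fa_fol_date, fa_fol_b_focal_animid;&amp;lt;/pre&amp;gt;&lt;br /&gt;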
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
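&lt;br /&gt;
With follow_arrival as the one source of truth, the nest flags can be derived from fa_type_of_nesting.  The sketch below assumes the reading of the codes implied by the Bad Data queries for problems #26 and #27: 1 or 3 means the focal began in a nest, 2 or 3 means the focal ended in a nest.  The output column names are illustrative.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: derive the in-nest flags for the focal&amp;#039;s own arrival rows.&lt;br /&gt;
-- A NULL fa_type_of_nesting yields NULL flags.&lt;br /&gt;
SELECT fa_fol_date, fa_fol_b_focal_animid, fa_time_start, fa_time_end&lt;br /&gt;
     , (fa_type_of_nesting IN (1, 3)) AS begins_in_nest&lt;br /&gt;
     , (fa_type_of_nesting IN (2, 3)) AS ends_in_nest&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
  ORDER BY fa_fol_date, fa_fol_b_focal_animid, fa_time_start;&amp;lt;/pre&amp;gt;&lt;br /&gt;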
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the follow_arrival value.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#27) Mismatch of end-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These involve &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
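&lt;br /&gt;
A minimal sketch of this normalization, combined with the trailing-space removal of problem #31.  It assumes the cleanup can be written as a single expression over &amp;lt;code&amp;gt;easy.follow&amp;lt;/code&amp;gt;; the actual conversion code is not shown on this page and may differ.  The output column names are made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: list each animid that would change, with its cleaned form.&lt;br /&gt;
select distinct fol_b_animid as raw_animid&lt;br /&gt;
              , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by raw_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;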
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These involve 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
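&lt;br /&gt;
A sketch of how the substitution might look for the morning period.  It assumes the same empty-value test used in problem #37, and that &amp;quot;NONE&amp;quot; is the code of the placeholder person; the output column names are hypothetical, and the actual conversion code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: morning observers, with NONE filling an empty slot.&lt;br /&gt;
-- Follows with no morning observer at all are excluded (see problem #36).&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as am_brec&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as am_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;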
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolving this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
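&lt;br /&gt;
A sketch of the substitution, assuming the new value is stored as the literal text &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.  The &amp;lt;code&amp;gt;cycle_state&amp;lt;/code&amp;gt; column name is made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show MISS wherever there is no cycle information.&lt;br /&gt;
select fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid&lt;br /&gt;
     , coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where fa_type_of_cycle is null&lt;br /&gt;
  order by fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;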
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with the community entry date or the biography.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
Follow up with Karl.&lt;br /&gt;
This will be re-reviewed should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
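&lt;br /&gt;
A sketch of the second rule, assuming the cycle code is stored as text (so &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; is the string &amp;#039;0&amp;#039;).  The &amp;lt;code&amp;gt;proposed_cycle&amp;lt;/code&amp;gt; column name is made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: preview the proposed cycle code for non-female arrivals.&lt;br /&gt;
select follow_arrival.fa_fol_date&lt;br /&gt;
     , follow_arrival.fa_b_arr_animid&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , follow_arrival.fa_type_of_cycle&lt;br /&gt;
     , case when follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039; then &amp;#039;n/a&amp;#039;&lt;br /&gt;
            else follow_arrival.fa_type_of_cycle&lt;br /&gt;
       end as proposed_cycle&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;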
&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
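&lt;br /&gt;
To illustrate the cutoff arithmetic in the query above with a made-up follow date of 15 June 2000 (not taken from the data): the two intervals together subtract 15 years, so the cutoff is 15 June 1985.  Only females born on or before that date, who are at least 15 years old on the follow date, are reported.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: the follow date is a hypothetical example value.&lt;br /&gt;
select date &amp;#039;2000-06-15&amp;#039;&lt;br /&gt;
         - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
         - &amp;#039;14 years&amp;#039;::INTERVAL as cutoff;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;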
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;, or&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
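&lt;br /&gt;
A sketch of the case corrections listed above, assuming they can be written as a single &amp;lt;code&amp;gt;CASE&amp;lt;/code&amp;gt; expression.  The &amp;lt;code&amp;gt;cleaned_source&amp;lt;/code&amp;gt; column name is made up for illustration, and the actual conversion code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: preview the corrected code for every fa_data_source value.&lt;br /&gt;
select follow_arrival.fa_data_source&lt;br /&gt;
     , case when follow_arrival.fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;)&lt;br /&gt;
              then &amp;#039;Brec&amp;#039;&lt;br /&gt;
            when follow_arrival.fa_data_source = &amp;#039;TikI&amp;#039;&lt;br /&gt;
              then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
            else follow_arrival.fa_data_source&lt;br /&gt;
       end as cleaned_source&lt;br /&gt;
     , count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;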
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for two rows to be duplicates?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
     , fa_fol_b_focal_animid&lt;br /&gt;
     , fa_b_arr_animid&lt;br /&gt;
     , fa_seq_num&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  group by fa_fol_date&lt;br /&gt;
         , fa_fol_b_focal_animid&lt;br /&gt;
         , fa_b_arr_animid&lt;br /&gt;
         , fa_seq_num&lt;br /&gt;
  having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
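&lt;br /&gt;
The check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; has no query listed above.  A sketch of it follows, assuming it mirrors the previous query with &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; dropped from both the grouping and the match.  This is a reconstruction, and may not be the exact query that produced the 430 rows.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (reconstructed; an assumption)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;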
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both the start and end times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
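&lt;br /&gt;
A minimal sketch of such a cleanup, assuming the affected table is &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and that nothing else depends on the untrimmed values.  It is an illustration, not necessarily the statement used in the conversion.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: the schema-qualified table name is an assumption.&lt;br /&gt;
-- Touch only the rows that the query above reports.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;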
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The MS Access database likely allows this condition because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be either greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything about the affected rows, but it is adequate.&lt;br /&gt;
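&lt;br /&gt;
A minimal sketch of that change, using the table and column from the query above.  It is an illustration of the approach, not necessarily the statement used in the conversion.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace every 0 animal ID number with NULL.&lt;br /&gt;
-- No other column of the affected rows is examined.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;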
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person; it is not in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=531</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=531"/>
		<updated>2026-02-16T16:33:46Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#44) There are follow arrivals where the arriving chimp arrives before being under study */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When the query is run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG on 2/1/2024.&lt;br /&gt;
Changed the end date of the KK_P1 membership to 8/14/2022.&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
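&lt;br /&gt;
A minimal sketch of the substitution, assuming the unknown person has the code &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; as described above.  The sample rows and the chimp ids in them are invented for illustration; they stand in for COMM_MEMB_LOG and are not real data.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample rows standing in for COMM_MEMB_LOG.&lt;br /&gt;
with sample_log (chimp_id, made_by) as (&lt;br /&gt;
  values (&amp;#039;AAA&amp;#039;, &amp;#039;ICG&amp;#039;)&lt;br /&gt;
       , (&amp;#039;BBB&amp;#039;, NULL))&lt;br /&gt;
-- A NULL made_by becomes the unknown person; other values pass through.&lt;br /&gt;
select chimp_id&lt;br /&gt;
     , coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from sample_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;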
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single person code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
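&lt;br /&gt;
As a sketch, the cleanup for problems #31 and #32 can be combined into the single &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; expression that the problem #33 queries use.  The sample values are invented for illustration and are not real follow rows.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample values standing in for easy.follow.fol_b_animid.&lt;br /&gt;
with sample_follow (fol_b_animid) as (&lt;br /&gt;
  values (&amp;#039;AAA  &amp;#039;)   -- trailing spaces (problem #31)&lt;br /&gt;
       , (&amp;#039;bbb&amp;#039;)     -- lower-case (problem #32)&lt;br /&gt;
       , (&amp;#039;CCC&amp;#039;))    -- already clean&lt;br /&gt;
select fol_b_animid as original&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned&lt;br /&gt;
  from sample_follow;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;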
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
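&lt;br /&gt;
The row-creation rule described above can be sketched as follows.  This is an illustration, not the conversion program itself: the sample rows are invented, and the period codes &amp;lt;code&amp;gt;AM&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;PM&amp;lt;/code&amp;gt; and the lower-case output column names are assumptions.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample rows standing in for clean.follow.&lt;br /&gt;
with sample_follow (fol_date, fol_b_animid&lt;br /&gt;
                  , fol_am_observer_1, fol_am_observer_2&lt;br /&gt;
                  , fol_pm_observer_1, fol_pm_observer_2) as (&lt;br /&gt;
  values (date &amp;#039;2000-01-01&amp;#039;, &amp;#039;AAA&amp;#039;, &amp;#039;X1&amp;#039;, &amp;#039;X2&amp;#039;, &amp;#039; &amp;#039;, NULL)&lt;br /&gt;
       , (date &amp;#039;2000-01-02&amp;#039;, &amp;#039;BBB&amp;#039;, NULL, &amp;#039;&amp;#039;, &amp;#039;X3&amp;#039;, &amp;#039;X4&amp;#039;))&lt;br /&gt;
-- One row per follow and period, skipped when both observers are blank.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , &amp;#039;AM&amp;#039; as period               -- assumed period code&lt;br /&gt;
     , fol_am_observer_1 as obs_brec&lt;br /&gt;
     , fol_am_observer_2 as obs_tiki&lt;br /&gt;
  from sample_follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
union all&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , &amp;#039;PM&amp;#039;                         -- assumed period code&lt;br /&gt;
     , fol_pm_observer_1&lt;br /&gt;
     , fol_pm_observer_2&lt;br /&gt;
  from sample_follow&lt;br /&gt;
  where coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With the sample rows, the first follow yields only an AM row and the second only a PM row.&lt;br /&gt;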
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
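&lt;br /&gt;
A minimal sketch of the substitution, assuming a missing observer is either NULL or blank after trimming.  The sample rows and observer codes are invented for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample rows, each with only one observer recorded.&lt;br /&gt;
with sample_observers (obs_brec, obs_tiki) as (&lt;br /&gt;
  values (&amp;#039;X1&amp;#039;, NULL)&lt;br /&gt;
       , (&amp;#039;  &amp;#039;, &amp;#039;X2&amp;#039;))&lt;br /&gt;
-- nullif() turns a blank into NULL, coalesce() then supplies NONE.&lt;br /&gt;
select coalesce(nullif(btrim(obs_brec), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(obs_tiki), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_tiki&lt;br /&gt;
  from sample_observers;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;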
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
    WHERE STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
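&lt;br /&gt;
One way the marking could be done is sketched below.  The table &amp;lt;code&amp;gt;sample_people&amp;lt;/code&amp;gt; and its &amp;lt;code&amp;gt;active&amp;lt;/code&amp;gt; column are hypothetical stand-ins; the actual name of the column that records whether a person is active may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical stand-in for the people table; &amp;quot;active&amp;quot; is an assumed column.&lt;br /&gt;
create temporary table sample_people (&lt;br /&gt;
  person text primary key,&lt;br /&gt;
  active boolean not null default true);&lt;br /&gt;
&lt;br /&gt;
insert into sample_people (person)&lt;br /&gt;
  values (&amp;#039;X1&amp;#039;), (&amp;#039;X1/X2&amp;#039;), (&amp;#039;n/a&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
-- Deactivate every person whose code contains a &amp;quot;/&amp;quot; character.&lt;br /&gt;
update sample_people&lt;br /&gt;
  set active = false&lt;br /&gt;
  where strpos(person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0;&lt;br /&gt;
&lt;br /&gt;
select person, active&lt;br /&gt;
  from sample_people&lt;br /&gt;
  order by person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that a rule based only on the &amp;quot;/&amp;quot; character also deactivates &amp;quot;n/a&amp;quot;.&lt;br /&gt;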
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, the columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 contain 511 names that differ from another name only by character case.  That is about half as many in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names, and excluding the duplicates, complicates the conversion process, so resolving this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Unresolved.  Something is wrong with either the community entry date or the biography data.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
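&lt;br /&gt;
A minimal sketch of the change from &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;, assuming it is applied directly to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  Where the change is actually made is an assumption; the conversion may handle it elsewhere.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: assumes the fix belongs in clean.follow_arrival.&lt;br /&gt;
-- Sets the cycle code to n/a for arriving individuals who are not female&lt;br /&gt;
-- and who currently have a cycle code of 0.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;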
&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
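&lt;br /&gt;
To see the cutoff this query applies, the date arithmetic can be evaluated for a single follow date.  A small sketch, using a made-up date (&amp;lt;code&amp;gt;2020-06-15&amp;lt;/code&amp;gt; is illustrative, not taken from the data):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical follow date, for illustration only.&lt;br /&gt;
-- The result falls 15 years before the follow date, so the query selects&lt;br /&gt;
-- only females who have reached their 15th birthday by the day of the follow.&lt;br /&gt;
select &amp;#039;2020-06-15&amp;#039;::DATE&lt;br /&gt;
       - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
       - &amp;#039;14 years&amp;#039;::INTERVAL AS latest_birthdate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;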
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
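&lt;br /&gt;
The two case corrections could be sketched as follows, assuming they are applied directly to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;.  This is an illustration of the kind of statement involved, not the statements the conversion actually runs.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the character case of the data source codes.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;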
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
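&lt;br /&gt;
The start-time-only check described in the Problem section has no query listed.  A sketch of it, assuming it mirrors the start and end time query with &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; left off:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (sketch; assumed to mirror the query above)&lt;br /&gt;
-- Note: rows with a NULL fa_time_start never match in the equality test.&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;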
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
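&lt;br /&gt;
A minimal sketch of the cleanup, assuming it is applied to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the schema-qualified table name is an assumption, since the query above does not name a schema):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal id,&lt;br /&gt;
-- touching only the rows that actually have them.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;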
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely due to difference in character case between column names and primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force, but adequate, solution: it does not validate anything concerning the affected rows.&lt;br /&gt;
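&lt;br /&gt;
As a sketch, assuming the change is made directly in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, the statement could look like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace the 0 placeholder with NULL.&lt;br /&gt;
-- No other checks are made on the affected rows.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;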
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
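&lt;br /&gt;
A minimal sketch of that cleanup, assuming it is done with a direct &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; and that &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; uses the same column name as the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema query above:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Assumption: only rows that actually have trailing spaces are rewritten.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_b_arr_animid = rtrim(fa_b_arr_animid)&lt;br /&gt;
  WHERE fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&amp;lt;/pre&amp;gt;&lt;br /&gt;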
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
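&lt;br /&gt;
A minimal sketch of the second half of that change, assuming the new ARRIVAL_SOURCES code is the literal string &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; and has already been added (the ARRIVAL_SOURCES column names are not given here, so that step is left out):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Assumption: &amp;#039;none&amp;#039; already exists as a code in ARRIVAL_SOURCES.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  WHERE fa_data_source IS NULL;&amp;lt;/pre&amp;gt;&lt;br /&gt;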
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=530</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=530"/>
		<updated>2026-02-16T16:28:54Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#43) There are follow arrivals with no related follow */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When the query is run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS, their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
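&lt;br /&gt;
A minimal sketch of the substitution, assuming an &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; row already exists in &amp;lt;code&amp;gt;clean.people&amp;lt;/code&amp;gt; and that the change is applied as a direct update (the commit noted below may do this differently, inside the conversion program):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Assumption: people.person = &amp;#039;UNK&amp;#039; exists before this runs.&lt;br /&gt;
UPDATE clean.community_membership_update_log&lt;br /&gt;
  SET made_by = &amp;#039;UNK&amp;#039;&lt;br /&gt;
  WHERE made_by IS NULL;&amp;lt;/pre&amp;gt;&lt;br /&gt;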
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; column is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
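&lt;br /&gt;
As an illustration only, the cleanup for this problem and for problem #31 can be expressed with the same &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; expression that appears in the problem #33 queries.  The animid values below are made up for the example, and the conversion code itself may be written differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Made-up animids: one with trailing spaces, one lower-case, one clean.&lt;br /&gt;
select raw_animid&lt;br /&gt;
     , upper(rtrim(raw_animid)) as cleaned_animid&lt;br /&gt;
  from (values (&amp;#039;FD  &amp;#039;), (&amp;#039;fd&amp;#039;), (&amp;#039;FD&amp;#039;)) as examples (raw_animid);&lt;br /&gt;
-- Every row yields a cleaned_animid of &amp;#039;FD&amp;#039;.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;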
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
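&lt;br /&gt;
The rules above can be sketched as a query.  This is an illustration, not the conversion code itself: the &amp;lt;code&amp;gt;AM&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;PM&amp;lt;/code&amp;gt; period codes and the output column names are assumptions.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- One row per follow and time period, skipping periods with no observers.&lt;br /&gt;
-- The period codes &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; are hypothetical.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , &amp;#039;AM&amp;#039; as period&lt;br /&gt;
     , btrim(fol_am_observer_1) as obs_brec&lt;br /&gt;
     , btrim(fol_am_observer_2) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
union all&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , &amp;#039;PM&amp;#039; as period&lt;br /&gt;
     , btrim(fol_pm_observer_1) as obs_brec&lt;br /&gt;
     , btrim(fol_pm_observer_2) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;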
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
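&lt;br /&gt;
No query was recorded for this problem.  A possible sketch, assuming the same &amp;lt;code&amp;gt;clean.follow&amp;lt;/code&amp;gt; observer columns and the same blank test used in problem #37, lists the follows where exactly one of the two observers is present in a time period:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- A period has exactly one observer when one column is blank&lt;br /&gt;
-- and the other is not, so the two &amp;quot;is blank&amp;quot; tests differ.&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;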
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolving this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
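&lt;br /&gt;
The second change could be sketched as the following update.  This assumes the change is applied to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema rather than inside the conversion code; where the change is actually made is not recorded here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: give non-females a cycle code of n/a in place of 0.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;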
&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
	and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
		)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problem rows will have to be corrected in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
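&lt;br /&gt;
A minimal sketch of that change, assuming the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema tables can be updated in place and that &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; is already an accepted cycle code.  The statement is illustrative and is not necessarily the one used in the conversion:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Recode n/a to MISS, for female arriving individuals only&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;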
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
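&lt;br /&gt;
The case corrections could be sketched as below, assuming they are applied to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the target table is an assumption; only the code values come from the text above):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Normalize the capitalization of the data source codes&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;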
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for two rows to be duplicates?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
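&lt;br /&gt;
The check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; is not listed above.  A sketch modeled on the previous query follows; it is assumed, not confirmed, to be how the 430 row count was obtained:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (fa_time_end left off)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;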
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
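&lt;br /&gt;
A sketch of the cleanup, assuming the table in question is &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and that the column is a variable-length text type, so that trailing spaces are significant in comparisons:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Strip trailing spaces from the focal animid; touches only the rows that have them&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;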
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
	       , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between column names and primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force, but adequate, solution: it does not validate anything concerning the rows affected.&lt;br /&gt;
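&lt;br /&gt;
As a sketch, assuming the change is made directly in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Replace the placeholder 0 with NULL; no other validation of the rows is done&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;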
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
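&lt;br /&gt;
A sketch of the second half of that change.  Creating the &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; row in ARRIVAL_SOURCES is left out because that table&amp;#039;s columns are not described on this page; the statement below assumes the row already exists:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Replace NULL data sources with the explicit none code&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  where fa_data_source is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;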
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=529</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=529"/>
		<updated>2026-02-16T16:25:53Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#41)There are follow arrivals with NULL nesting information */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which help eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column contains non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community that is not their birth community, on a date before BIOGRAPHY says they&lt;br /&gt;
entered the community (BIOGRAPHY.EntryDate).&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by changing the end date of the KK_P1 membership to 8/14/2022. ICG 2/1/2024&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
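&lt;br /&gt;
As a minimal sketch of the substitution described above (an illustration only, not the conversion program&amp;#039;s actual code), NULL MadeBy values can be mapped to the unknown person with &amp;lt;code&amp;gt;coalesce&amp;lt;/code&amp;gt;.  The sketch assumes that &amp;#039;UNK&amp;#039; is the code given to the unknown person:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: report every log row, substituting UNK where made_by is NULL.&lt;br /&gt;
-- Assumes &amp;#039;UNK&amp;#039; is the code of the unknown person.&lt;br /&gt;
select date_of_update, chimp_id, coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log&lt;br /&gt;
  order by date_of_update, chimp_id;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;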
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single person code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the latest end&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist because they have trailing spaces.  These involve &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
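&lt;br /&gt;
The cleanups for problems #31 and #32 can be previewed together.  The following is a sketch, not the conversion program&amp;#039;s actual code; it reuses the &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; expression from the problem #33 queries and lists each follow whose animid would change.  The &amp;lt;code&amp;gt;cleaned_animid&amp;lt;/code&amp;gt; alias is illustrative only:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: follows whose animid changes when trailing spaces are&lt;br /&gt;
-- removed and the result is forced to upper-case.&lt;br /&gt;
-- cleaned_animid is an illustrative alias, not a real column.&lt;br /&gt;
select fol_date&lt;br /&gt;
     , fol_b_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;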
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These involve 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that were done before their focal was under study, that is, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 columns go into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 columns go into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
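&lt;br /&gt;
A sketch of a query that would list such follows.  It assumes that &amp;quot;only one observer&amp;quot; means that, within the AM or the PM period, exactly one of the two observer columns is NULL or empty after trimming.  This interpretation is an assumption, not something taken from the conversion code.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: follows where a period has exactly one observer.&lt;br /&gt;
-- Each parenthesized test is true when the column is NULL or blank;&lt;br /&gt;
-- the two tests differing means exactly one observer is present.&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;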
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for this and excluding the duplicates in the conversion process is complicated, so resolving this issue is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
This will be re-reviewed should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
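A sketch of how the change from &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt; could be written.  It assumes the change is made in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, which this entry does not state, and it is illustrative rather than the statement the conversion actually runs.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: non-female arrivals with a cycle code of 0 become n/a.&lt;br /&gt;
-- Assumes the fix is applied to the clean schema.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;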
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
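&lt;br /&gt;
A sketch of the change described above, applied in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  It reuses the join from the query in the Bad data section; the statement the conversion actually runs is not shown on this page, so treat this as illustrative.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: female arrivals with a cycle code of n/a become MISS.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;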
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
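&lt;br /&gt;
A sketch of how the two case corrections above could be applied in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  The statements are illustrative; the cleanup code the conversion actually uses is not shown on this page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: normalize the case variants of Brec&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
-- Sketch: normalize the case variant of Tiki&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;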
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for two rows to be duplicates?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
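&lt;br /&gt;
The variant that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; is not shown above.  A sketch of it, obtained by dropping that column from the preceding query (an assumption about how that check was run):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: checking against start time only&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;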
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
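&lt;br /&gt;
A minimal sketch of one way to do this, assuming the affected table is &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and that nothing else depends on the untrimmed values (this is an illustration, not the conversion program&amp;#039;s actual code):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Strip trailing spaces only from the rows that have them.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;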
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The MS Access database likely allows this condition because the character case of the column names differs from that of the primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, in that it does not validate anything concerning the rows affected, but it is adequate.&lt;br /&gt;
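&lt;br /&gt;
As a sketch only, and assuming the change is applied directly to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema rather than inside the conversion program, the change could look like:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Replace every 0 animal ID number with NULL; no other checks are made.&lt;br /&gt;
UPDATE clean.biography&lt;br /&gt;
  SET b_animid_num = NULL&lt;br /&gt;
  WHERE b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;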
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focal pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
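&lt;br /&gt;
A possible form of the cleanup, assuming the table is &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and that the trimmed ids are the ones intended (a sketch, not the conversion program&amp;#039;s actual code):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Strip trailing spaces from the arriving animid where present.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_b_arr_animid = rtrim(fa_b_arr_animid)&lt;br /&gt;
  WHERE fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;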
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
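&lt;br /&gt;
The second half of this could be sketched as follows. It assumes the &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; row has already been added to ARRIVAL_SOURCES (that table&amp;#039;s column names are not given here, so that step is not shown):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumes a &amp;#039;none&amp;#039; value already exists in ARRIVAL_SOURCES.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  WHERE fa_data_source IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;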
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=528</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=528"/>
		<updated>2026-02-16T16:25:10Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#35) There are follows done before a focal was under study */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When the query is run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
        SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
        TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
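&lt;br /&gt;
As a minimal sketch of the substitution (not the conversion program&amp;#039;s actual code), and assuming the unknown person&amp;#039;s code is the literal &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt;, a query against the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema could read:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: replace a NULL made_by with the unknown person.&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
     , coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;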
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale.&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
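&lt;br /&gt;
Given the solutions to problems #24 and #25, a follow&amp;#039;s start and end can be read from FOLLOW_ARRIVAL alone.  A sketch, restricted to the focal&amp;#039;s own arrival rows as in the bad-data queries (the output column names are made up for illustration):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: derive each follow&amp;#039;s span from the focal&amp;#039;s arrivals.&lt;br /&gt;
SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
     , MIN(fa_time_start) AS derived_begin&lt;br /&gt;
     , MAX(fa_time_end) AS derived_end&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
  GROUP BY fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
  ORDER BY fa_fol_date, fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;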
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution to both problems #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
297 follow_arrival rows are updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
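&lt;br /&gt;
Together with problem #31, the cleanup amounts to something like the following sketch.  (This is not the conversion program&amp;#039;s actual code; the output column names are made up for illustration.)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: show each animid that the cleanup would change.&lt;br /&gt;
select distinct fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by original_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;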
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
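&lt;br /&gt;
The row-creation rule can be sketched as a query that shows, per follow, whether each time period would get a FOLLOW_OBSERVERS row.  (This is an illustration, not the conversion program&amp;#039;s code; the output column names are made up.)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: a period gets a row when either observer is non-blank.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;) as has_am_row&lt;br /&gt;
     , (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;) as has_pm_row&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;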
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
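&lt;br /&gt;
A sketch of the substitution for the morning period, assuming the person&amp;#039;s code is the literal &amp;lt;code&amp;gt;NONE&amp;lt;/code&amp;gt;.  (The output column names are made up for illustration; the afternoon columns would be handled the same way.)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: fill a blank or NULL observer with the NONE person.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as am_brec&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as am_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;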
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
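&lt;br /&gt;
A simpler way to list every observer value containing a &amp;quot;/&amp;quot; character, without the self-join on character case, might be the following sketch.  It is offered as an alternative check only, and its row count may differ from the counts given in the problem description.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: all distinct observer values that contain a slash.&lt;br /&gt;
WITH observers AS (&lt;br /&gt;
  SELECT BTRIM(fol_am_observer_1) AS person FROM clean.follow&lt;br /&gt;
  UNION&lt;br /&gt;
  SELECT BTRIM(fol_am_observer_2) FROM clean.follow&lt;br /&gt;
  UNION&lt;br /&gt;
  SELECT BTRIM(fol_pm_observer_1) FROM clean.follow&lt;br /&gt;
  UNION&lt;br /&gt;
  SELECT BTRIM(fol_pm_observer_2) FROM clean.follow&lt;br /&gt;
)&lt;br /&gt;
SELECT person&lt;br /&gt;
  FROM observers&lt;br /&gt;
  WHERE STRPOS(person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
  ORDER BY person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;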
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
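&lt;br /&gt;
A minimal sketch of the change from &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;, assuming the fix is applied directly to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (the choice of schema is an assumption); the column names are those used in the queries above:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: reset the cycle code of non-female arrivals from 0 to n/a.&lt;br /&gt;
-- Assumes the update is run against the clean schema.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;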
&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
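&lt;br /&gt;
As a worked illustration of the cutoff arithmetic in the query above (the follow date is made up for the example and is not taken from the data):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical follow date, for illustration only.&lt;br /&gt;
-- Subtracting 1 year and then 14 years gives the date 15 years earlier,&lt;br /&gt;
-- so a female born on or before the result has reached her 15th birthday&lt;br /&gt;
-- by the follow date.&lt;br /&gt;
select &amp;#039;2000-06-15&amp;#039;::DATE&lt;br /&gt;
         - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
         - &amp;#039;14 years&amp;#039;::INTERVAL as latest_flagged_birthdate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;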
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
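&lt;br /&gt;
A minimal sketch of such a change, using the same join as the query above; this is an illustration of the idea, not necessarily the statement the conversion actually runs:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: mark the cycle code as missing for female arrivals coded n/a.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;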
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
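&lt;br /&gt;
The two case corrections above could be sketched as follows, assuming they are applied to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (where, per the note under Bad data, this data is cleaned):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the letter case of the data source codes.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;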
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
Checking for duplicates on the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt; finds none; the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and checking just the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for two rows to be duplicates?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
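&lt;br /&gt;
The check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; is not shown above. A sketch of one way to list the affected groups, modeled on the preceding queries (the &amp;lt;code&amp;gt;n_rows&amp;lt;/code&amp;gt; alias is introduced here for illustration):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: groups sharing date, focal, arriving individual, and start time&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
     , fa_fol_b_focal_animid&lt;br /&gt;
     , fa_b_arr_animid&lt;br /&gt;
     , fa_time_start&lt;br /&gt;
     , count(*) as n_rows&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  group by fa_fol_date&lt;br /&gt;
         , fa_fol_b_focal_animid&lt;br /&gt;
         , fa_b_arr_animid&lt;br /&gt;
         , fa_time_start&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fa_fol_date&lt;br /&gt;
         , fa_fol_b_focal_animid&lt;br /&gt;
         , fa_b_arr_animid&lt;br /&gt;
         , fa_time_start;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;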
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed FN community start date in MS Access - Oct 2025&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
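&lt;br /&gt;
A minimal sketch of the removal, assuming the column is a variable-length text type (with a fixed-width &amp;lt;code&amp;gt;char(n)&amp;lt;/code&amp;gt; column, trailing spaces would not be significant in the comparison):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal id, touching only affected rows.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;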
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between the column names and the primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be either greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force, but adequate, solution: it does not validate anything else about the affected rows.&lt;br /&gt;
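&lt;br /&gt;
A minimal sketch of the change, assuming it is applied in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema that the query above reads from:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace zero animal ID numbers with NULL.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;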
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=527</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=527"/>
		<updated>2026-02-16T16:24:38Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#34) There are duplicate animid, date combinations on FOLLOW */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When the query is run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
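&lt;br /&gt;
As a rough sketch only (the conversion code itself is not shown on this page), one way such values could be reduced to a bare time is to cast them to &amp;lt;code&amp;gt;TIME&amp;lt;/code&amp;gt;, as the query for problem #3 does.  The &amp;lt;code&amp;gt;brec_time_only&amp;lt;/code&amp;gt; alias is hypothetical:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch, not the actual conversion code: keep the time, drop the date.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;&lt;br /&gt;
     , &amp;quot;BREC_time&amp;quot;::TIME as brec_time_only  -- hypothetical alias&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;&lt;br /&gt;
  where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;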
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
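&lt;br /&gt;
The substitution can be sketched as follows.  This is an illustration, not the code from the commit; it assumes the unknown person&amp;#039;s code is the literal &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt;, and the &amp;lt;code&amp;gt;converted_made_by&amp;lt;/code&amp;gt; alias is hypothetical:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: use the unknown person wherever made_by is NULL.&lt;br /&gt;
select log.*&lt;br /&gt;
     , coalesce(log.made_by, &amp;#039;UNK&amp;#039;) as converted_made_by  -- hypothetical alias&lt;br /&gt;
  from clean.community_membership_update_log as log;&amp;lt;/pre&amp;gt;&lt;br /&gt;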
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
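&lt;br /&gt;
With the FOLLOW times ignored, the start and end of a follow can instead be derived from the focal&amp;#039;s own arrival rows.  A minimal sketch, reusing the aggregation from the queries for problems #24 and #25; the &amp;lt;code&amp;gt;derived_&amp;lt;/code&amp;gt; column names are hypothetical, and this is not necessarily how the conversion does it:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: derive the span of each follow from the arrivals of the focal.&lt;br /&gt;
SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
     , MIN(fa_time_start) AS derived_time_begin  -- hypothetical name&lt;br /&gt;
     , MAX(fa_time_end) AS derived_time_end      -- hypothetical name&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
  GROUP BY fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
  ORDER BY fa_fol_date, fa_fol_b_focal_animid;&amp;lt;/pre&amp;gt;&lt;br /&gt;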
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the follow_arrival data.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#27) Mismatch of end-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These comprise &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
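&lt;br /&gt;
As a minimal sketch, assuming the cleanups for this problem and for problem #31 are applied together as a single expression (the actual conversion code may differ), the rows that change and their cleaned values can be previewed as follows.  The &amp;lt;code&amp;gt;cleaned_animid&amp;lt;/code&amp;gt; alias is illustrative only.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: preview the animids altered by trimming and upper-casing&lt;br /&gt;
select fol_date&lt;br /&gt;
     , fol_b_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid  -- illustrative alias&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;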
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These comprise 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== * (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
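&lt;br /&gt;
A minimal sketch of the substitution, shown for the morning columns only and assuming the literal code &amp;lt;code&amp;gt;NONE&amp;lt;/code&amp;gt; identifies the no-observer person.  The column aliases are illustrative, and the afternoon columns would presumably be handled the same way.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: morning rows with at least one recorded observer,&lt;br /&gt;
-- with a missing observer replaced by the assumed NONE code&lt;br /&gt;
select fol_date&lt;br /&gt;
     , fol_b_animid&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;)&lt;br /&gt;
         as am_observer_1  -- illustrative alias&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;)&lt;br /&gt;
         as am_observer_2  -- illustrative alias&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;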
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for this and excluding the duplicates in the conversion process is complicated, so resolution of this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
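&lt;br /&gt;
A minimal sketch of the substitution, assuming the new &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value is stored as the literal &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.  The &amp;lt;code&amp;gt;cycle_state&amp;lt;/code&amp;gt; alias is illustrative only.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show the affected arrivals with the assumed MISS code&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
     , fa_fol_b_focal_animid&lt;br /&gt;
     , fa_b_arr_animid&lt;br /&gt;
     , coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state  -- illustrative alias&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where fa_type_of_cycle is null&lt;br /&gt;
  order by fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;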
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres, making the database a temporal database that tracks change history and can show the data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
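&lt;br /&gt;
A minimal sketch of the second rule, previewing what the problem rows would look like after the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; codes are rewritten.  It assumes the code is stored as the text &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;; the &amp;lt;code&amp;gt;cleaned_cycle&amp;lt;/code&amp;gt; alias is illustrative only.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: non-female arrivals, with a cycle code of 0 shown as n/a&lt;br /&gt;
select follow_arrival.fa_fol_date&lt;br /&gt;
     , follow_arrival.fa_b_arr_animid&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , follow_arrival.fa_type_of_cycle&lt;br /&gt;
     , case when follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;&lt;br /&gt;
            then &amp;#039;n/a&amp;#039;&lt;br /&gt;
            else follow_arrival.fa_type_of_cycle&lt;br /&gt;
       end as cleaned_cycle  -- illustrative alias&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;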
&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
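&lt;br /&gt;
The boundary of the query above can be checked with made-up dates.  In this sketch a hypothetical female born on 1985-06-15 is not flagged the day before her 15th birthday and is flagged on the birthday itself.  It assumes the date columns are of the Postgres &amp;lt;code&amp;gt;date&amp;lt;/code&amp;gt; type; the aliases are illustrative only.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: the dates are invented to illustrate the 15 year boundary&lt;br /&gt;
select date &amp;#039;1985-06-15&amp;#039;&lt;br /&gt;
         &amp;lt;= (date &amp;#039;2000-06-14&amp;#039;&lt;br /&gt;
             - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
             - &amp;#039;14 years&amp;#039;::INTERVAL)&lt;br /&gt;
         as flagged_day_before    -- false&lt;br /&gt;
     , date &amp;#039;1985-06-15&amp;#039;&lt;br /&gt;
         &amp;lt;= (date &amp;#039;2000-06-15&amp;#039;&lt;br /&gt;
             - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
             - &amp;#039;14 years&amp;#039;::INTERVAL)&lt;br /&gt;
         as flagged_on_birthday;  -- true&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;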
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source  count&lt;br /&gt;
  Tiki_GM         202&lt;br /&gt;
  Tiki_PM         165&lt;br /&gt;
  Tiki_SS         22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
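&lt;br /&gt;
A minimal sketch of the case fixes, assuming they are applied with a single &amp;lt;code&amp;gt;CASE&amp;lt;/code&amp;gt; expression and that every other code passes through unchanged.  The &amp;lt;code&amp;gt;cleaned_data_source&amp;lt;/code&amp;gt; alias is illustrative only.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: each raw code, its cleaned form, and its row count&lt;br /&gt;
select fa_data_source&lt;br /&gt;
     , case&lt;br /&gt;
         when fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;) then &amp;#039;Brec&amp;#039;&lt;br /&gt;
         when fa_data_source = &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
         else fa_data_source&lt;br /&gt;
       end as cleaned_data_source  -- illustrative alias&lt;br /&gt;
     , count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  group by fa_data_source&lt;br /&gt;
  order by fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;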
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
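&lt;br /&gt;
The start-time-only check from the Problem section can be written the same way. This is a minimal sketch, assuming the same &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; columns as the queries above; it only drops &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; from the grouping and the join condition:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: checking against start time only (end time ignored)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;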
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
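&lt;br /&gt;
A minimal sketch of the cleanup, assuming the column has a variable-length text type and that no key or constraint blocks the update:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal animid&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rerunning the query under Bad data afterwards should return no rows.&lt;br /&gt;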
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The MS Access database likely allows this condition because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be either &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; or greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This solution is adequate but brute-force: it does not validate anything about the affected rows.&lt;br /&gt;
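&lt;br /&gt;
A minimal sketch of the change, assuming it is applied directly to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema rather than in the MS Access source:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace zero animal ID numbers with NULL&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;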
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (It is not in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
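&lt;br /&gt;
A minimal sketch of the second step, assuming the &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; code has already been added to ARRIVAL_SOURCES (the columns of that table are not shown here, so the insert is left out):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: use the &amp;#039;none&amp;#039; code in place of NULL data sources&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  where fa_data_source is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;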
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=526</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=526"/>
		<updated>2026-02-16T16:24:05Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#33) Some follows have an animid that does not exist, even after animid cleanup */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
        SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
        TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column contains non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
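&lt;br /&gt;
The conversion code itself is not reproduced on this page. As a sketch under that caveat, one way to discard the stray date portion is to cast the column to a time; the &amp;lt;code&amp;gt;brec_time&amp;lt;/code&amp;gt; alias is hypothetical:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: keep the time of day and drop the date portion&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;::TIME as brec_time&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;;&amp;lt;/pre&amp;gt;&lt;br /&gt;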
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/1/2024.  Changed the end date of the KK_P1 membership to 8/14/2022.&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
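&lt;br /&gt;
A minimal sketch of the substitution, assuming it can be expressed as a simple &amp;lt;code&amp;gt;coalesce&amp;lt;/code&amp;gt; over the log table; the actual conversion code may do this differently:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: report the UNK person wherever made_by is NULL.&lt;br /&gt;
select date_of_update&lt;br /&gt;
     , chimp_id&lt;br /&gt;
     , coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;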
&lt;br /&gt;
Fixed in commit 7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
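&lt;br /&gt;
A sketch of the combined cleanup from problems #31 and #32, using the same &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; expression that appears in the problem #33 queries.  The &amp;lt;code&amp;gt;raw_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;cleaned_animid&amp;lt;/code&amp;gt; aliases are illustrative only, and the conversion itself may be written differently:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: list each follow whose animid the cleanup would change.&lt;br /&gt;
select fol_date&lt;br /&gt;
     , fol_b_animid as raw_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;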
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== * (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observer in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
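&lt;br /&gt;
A minimal sketch of this substitution for the morning period, assuming the observer conversion described in problem #36.  The &amp;lt;code&amp;gt;obs_brec&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;obs_tiki&amp;lt;/code&amp;gt; aliases are illustrative, and the real conversion may be implemented differently:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: a row is produced when at least one AM observer is&lt;br /&gt;
-- recorded; the missing observer, if any, becomes the NONE person.&lt;br /&gt;
select fol_date&lt;br /&gt;
     , fol_b_animid&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The afternoon period would use the fol_pm_observer_1 and fol_pm_observer_2 columns in the same way.&lt;br /&gt;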
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
    WHERE STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, the columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 contain 511 names that differ from another name only by character case.  That is roughly half as many unique names once case is ignored.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates during the conversion process is complicated, so resolving this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching row in the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres, making the database a temporal database that tracks change history and can show the data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
This will be re-reviewed should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
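&lt;br /&gt;
The change from &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt; could be applied with an update along the following lines.  This is only a sketch.  It assumes the change is made in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, and it reuses the join condition and the sex test from the queries above.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only.  Assumes the fix belongs in the clean schema.&lt;br /&gt;
-- Touches only non-female arrivals that have a cycle code of 0.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Re-running the summary query afterwards should show no remaining &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; codes for non-females.&lt;br /&gt;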
&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
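&lt;br /&gt;
To make the cutoff concrete, the date arithmetic used in the query can be evaluated for a single follow date.  The follow date &amp;lt;code&amp;gt;2020-06-15&amp;lt;/code&amp;gt; below is made up for illustration and is not taken from the data.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only; the follow date is hypothetical.&lt;br /&gt;
-- Returns 2005-06-15 00:00:00&lt;br /&gt;
select &amp;#039;2020-06-15&amp;#039;::DATE&lt;br /&gt;
         - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
         - &amp;#039;14 years&amp;#039;::INTERVAL as latest_flagged_birthdate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For this follow date, a &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; code is reported only when the birthdate is on or before 2005-06-15, that is, when the female has reached her 15th birthday by the date of the follow.&lt;br /&gt;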
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
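&lt;br /&gt;
The two case corrections could be written as a single update, as sketched here.  How the cleanup was actually performed is not recorded on this page, so this is an assumption about one way to do it in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only.  Assumes the normalization is applied to the clean schema.&lt;br /&gt;
-- The where clause limits the update to the three misspelled codes,&lt;br /&gt;
-- so the case expression never falls through to NULL.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source&lt;br /&gt;
        = case&lt;br /&gt;
            when fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;) then &amp;#039;Brec&amp;#039;&lt;br /&gt;
            when fa_data_source = &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
          end&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;, &amp;#039;TikI&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;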
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
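&lt;br /&gt;
The check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; is described in the problem statement but has no query listed.  A sketch of it, modeled on the start and end time query, follows.  It has not been confirmed that this exact form is the one that produced the 430 row count.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: checking against start time alone&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;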
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
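&lt;br /&gt;
A minimal sketch of the removal, assuming the column is a variable-length text type so that the comparison used in the query above detects the trailing spaces:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only.  Rewrites just the rows that carry trailing spaces.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;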
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The MS Access database seems to allow this condition, likely because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force, but adequate, solution.  It is brute-force because it does not validate anything concerning the rows affected.&lt;br /&gt;
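&lt;br /&gt;
As a sketch, and assuming the change is made in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, the update could be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only.  Replaces every 0 animal ID number with NULL.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
After the update, the query listed under Bad Data should return no rows.&lt;br /&gt;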
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
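&lt;br /&gt;
One way this could be done, sketched under the assumption that the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema table and column carry the same names as the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; ones queried above:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: clean.follow_arrival is assumed to mirror easy.follow_arrival.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_b_arr_animid = rtrim(fa_b_arr_animid)&lt;br /&gt;
  WHERE fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&amp;lt;/pre&amp;gt;&lt;br /&gt;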
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
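&lt;br /&gt;
A rough sketch of the second half of that change, assuming the &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value has already been added to ARRIVAL_SOURCES (its column layout is not shown here, so that step is omitted) and that &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; has the same &amp;lt;code&amp;gt;fa_data_source&amp;lt;/code&amp;gt; column as the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: assumes &amp;#039;none&amp;#039; already exists in ARRIVAL_SOURCES.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  WHERE fa_data_source IS NULL;&amp;lt;/pre&amp;gt;&lt;br /&gt;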
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=525</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=525"/>
		<updated>2026-02-16T16:23:05Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#30) Some follows have no community */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When the query is run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column contains non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
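&lt;br /&gt;
The substitution could be sketched as follows.  This is an illustration only, not the code from the commit; it assumes the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; person already exists in &amp;lt;code&amp;gt;clean.people&amp;lt;/code&amp;gt; and that the change is applied in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: assumes a people row with person = &amp;#039;UNK&amp;#039; already exists.&lt;br /&gt;
UPDATE clean.community_membership_update_log&lt;br /&gt;
  SET made_by = &amp;#039;UNK&amp;#039;&lt;br /&gt;
  WHERE made_by IS NULL;&amp;lt;/pre&amp;gt;&lt;br /&gt;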
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
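&lt;br /&gt;
Since both fol_time_begin (problem #24) and fol_time_end are ignored, the span of a follow has to come from FOLLOW_ARRIVAL.  The following is a minimal sketch, not the conversion&amp;#039;s actual code.  It assumes that the focal&amp;#039;s own arrival rows bound the follow.  The derived_begin and derived_end names are made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: derive each follow&amp;#039;s span from the focal&amp;#039;s own arrival rows.&lt;br /&gt;
-- derived_begin and derived_end are illustrative names, not real columns.&lt;br /&gt;
SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
     , MIN(fa_time_start) AS derived_begin&lt;br /&gt;
     , MAX(fa_time_end) AS derived_end&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
  GROUP BY fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
  ORDER BY fa_fol_date, fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;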
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
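&lt;br /&gt;
A minimal sketch of this cleanup, combined with the trailing-space removal of problem #31.  It is not the conversion&amp;#039;s actual code, and only shows the effect of the normalization on the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.  The original_animid and cleaned_animid names are made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: show every animid that the cleanup would change.&lt;br /&gt;
SELECT fol_b_animid AS original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) AS cleaned_animid&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
  WHERE fol_b_animid &amp;lt;&amp;gt; upper(rtrim(fol_b_animid))&lt;br /&gt;
  ORDER BY fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;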
&lt;br /&gt;
== * (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
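&lt;br /&gt;
The row-creation rule above can be sketched as a query.  This is an illustration for the morning period only, not the conversion&amp;#039;s actual code.  The &amp;lt;code&amp;gt;AM&amp;lt;/code&amp;gt; period code and the output column names are assumptions.  The afternoon period would be handled the same way, using the fol_pm_* columns.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: one candidate FOLLOW_OBSERVERS row per follow for the AM period.&lt;br /&gt;
-- No row is produced when both observers are NULL or empty after trimming.&lt;br /&gt;
-- The &amp;#039;AM&amp;#039; code is an assumption.&lt;br /&gt;
SELECT fol_date, fol_b_animid&lt;br /&gt;
     , &amp;#039;AM&amp;#039; AS period&lt;br /&gt;
     , btrim(fol_am_observer_1) AS obs_brec&lt;br /&gt;
     , btrim(fol_am_observer_2) AS obs_tiki&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
  WHERE coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        OR coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  ORDER BY fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;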
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
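&lt;br /&gt;
One way to express the substitution, as a sketch rather than the conversion&amp;#039;s actual code, is to turn empty strings into NULL and then replace NULL with the &amp;quot;NONE&amp;quot; person.  Only the morning columns are shown, and the output column names are assumptions.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: substitute &amp;#039;NONE&amp;#039; when an observer is NULL or empty after trimming.&lt;br /&gt;
SELECT fol_date, fol_b_animid&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) AS obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) AS obs_tiki&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
  ORDER BY fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;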
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
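&lt;br /&gt;
To review which observer values would be marked as not active, the values containing a &amp;quot;/&amp;quot; can be listed directly.  This is a sketch, and it assumes the four observer columns share a text type.  Its row count is not necessarily the same as the count reported above, because it does not require that a value also differ from another by character case.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: every distinct observer value that contains a &amp;#039;/&amp;#039;.&lt;br /&gt;
SELECT DISTINCT btrim(observer) AS person&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
     , unnest(ARRAY[fol_am_observer_1, fol_am_observer_2&lt;br /&gt;
                  , fol_pm_observer_1, fol_pm_observer_2]) AS observer&lt;br /&gt;
  WHERE strpos(observer, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
  ORDER BY person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;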
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for this, and excluding the duplicates, complicates the conversion process.  So resolution of this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
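&lt;br /&gt;
A minimal sketch of the substitution.  It assumes the change is applied to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and that the &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; code is stored as text; the conversion&amp;#039;s actual code may apply it at a different step.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: give arrivals with no cycle information the MISS code.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  WHERE fa_type_of_cycle IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;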
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
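&lt;br /&gt;
The second change can be sketched as an update joined to the biography table.  This is an illustration, not the conversion&amp;#039;s actual code.  It assumes the change is applied to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and that the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; code is stored as text.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: recode non-female arrivals from &amp;#039;0&amp;#039; to &amp;#039;n/a&amp;#039;.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  FROM clean.biography&lt;br /&gt;
  WHERE biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        AND biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        AND follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;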
&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
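&lt;br /&gt;
A minimal sketch of this change, assuming it is applied directly to the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema using the PostgreSQL &amp;lt;code&amp;gt;UPDATE ... FROM&amp;lt;/code&amp;gt; form.  The actual conversion scripts may do this differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: recode n/a to MISS for female arriving individuals.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
After such an update, the query under Bad data should return no rows.&lt;br /&gt;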
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
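&lt;br /&gt;
The capitalization fixes could be sketched as follows, assuming they are applied to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema while the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema keeps the original values.  This is an illustration, not necessarily how the conversion performs the change.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the capitalization of known data sources.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;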
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and checking only the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for two rows to be duplicates?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
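&lt;br /&gt;
-- Sketch only: the check that leaves off fa_time_end is not shown on&lt;br /&gt;
-- this page.  This version assumes it mirrors the query above, with&lt;br /&gt;
-- fa_time_end dropped from the grouping and from the match.&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;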
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both the start and end times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
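&lt;br /&gt;
One way to do this, sketched under the assumption that the column is a plain text column in &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and that the fix is made with a single update:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal animid.&lt;br /&gt;
-- The where clause limits the update to rows that actually change.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;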
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
	       , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The MS Access database likely allows this condition because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution.  It is adequate, but it does not validate anything about the affected rows.&lt;br /&gt;
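&lt;br /&gt;
A minimal sketch of the change, assuming it is made directly in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace 0 animal ID numbers with NULL.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;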
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (It is not in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focal pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
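&lt;br /&gt;
The second half of this could be sketched as below.  It assumes the &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; row has already been added to ARRIVAL_SOURCES.  The columns of that table are not shown on this page, so the insert is left out rather than guessed.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace NULL data sources with the none code.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  where fa_data_source is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;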
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=524</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=524"/>
		<updated>2026-02-16T16:22:11Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#27) Mismatch of end-in-nest on follow and follow_arrival */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which help eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
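&lt;br /&gt;
A minimal sketch of one way such a fix could look, assuming the conversion simply keeps the time-of-day portion of BREC_time and discards the date.  The actual conversion code may differ, and the &amp;lt;code&amp;gt;brec_time_only&amp;lt;/code&amp;gt; alias is hypothetical:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: only the time of day is meaningful in BREC_time.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;::TIME WITHOUT TIME ZONE as brec_time_only&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;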
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
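&lt;br /&gt;
The substitution can be sketched with &amp;lt;code&amp;gt;coalesce&amp;lt;/code&amp;gt;.  This is only an illustration, not the code from the commit; it assumes the unknown person&amp;#039;s code is the literal &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Replace a NULL made_by with the unknown person (assumed code: UNK).&lt;br /&gt;
select coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;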
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
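&lt;br /&gt;
A sketch of deriving a follow&amp;#039;s begin and end from FOLLOW_ARRIVAL alone, reusing the &amp;lt;code&amp;gt;spans&amp;lt;/code&amp;gt; logic of the queries for problems #24 and #25.  It assumes the focal&amp;#039;s own arrival rows define the follow; the &amp;lt;code&amp;gt;derived_begin&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;derived_end&amp;lt;/code&amp;gt; aliases are hypothetical:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- One row per follow, taken only from the focal&amp;#039;s own arrivals.&lt;br /&gt;
SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
     , MIN(fa_time_start) AS derived_begin&lt;br /&gt;
     , MAX(fa_time_end) AS derived_end&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
  GROUP BY fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
  ORDER BY fa_fol_date, fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;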
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore. Default to data from Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
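&lt;br /&gt;
Together with the trailing-space removal of problem #31, the cleanup amounts to the expression already used in the problem #33 queries.  A minimal sketch (the &amp;lt;code&amp;gt;cleaned_animid&amp;lt;/code&amp;gt; alias is hypothetical, and the actual conversion code may differ):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Strip trailing spaces, then force upper-case; list only the rows that change.&lt;br /&gt;
select fol_b_animid, upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;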
&lt;br /&gt;
== * (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that were done before their focal was under study, that is, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 columns go into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 columns go into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person both of the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
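&lt;br /&gt;
The affected follows can probably be listed with a query along the following lines.  This is only a sketch, not part of the conversion process: it assumes that an observer counts as missing when the column is NULL or the empty string after space trimming, as described in problem #36, and it reports a follow when exactly one of the two observer columns of either time period is missing.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: follows where a time period has exactly one observer&lt;br /&gt;
-- (Assumes &amp;quot;missing&amp;quot; means NULL or blank after trimming.)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;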
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolving this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres, making the database a temporal database that tracks change history and can show the data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
This will be re-reviewed should a temporal extension prove to be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and checking just the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for a row to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
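&lt;br /&gt;
The check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; is not recorded on this page.  The following is a sketch of what it presumably looks like, derived from the query above by dropping &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; from the grouping and the match; treat it as an assumption, not as the query originally run.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: checking against start time only (assumed form of the check)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;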
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
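&lt;br /&gt;
A minimal sketch of the cleanup, assuming &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; is an updatable table and that the column has a type (such as &amp;lt;code&amp;gt;TEXT&amp;lt;/code&amp;gt;) for which trailing spaces are significant in comparisons. This is not necessarily the statement used by the conversion:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Strip trailing spaces from the focal animid, touching only rows that need it.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;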
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The MS Access database likely allows this condition because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be either greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything concerning the affected rows, but it is adequate.&lt;br /&gt;
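&lt;br /&gt;
As a hedged sketch (assuming &amp;lt;code&amp;gt;clean.biography&amp;lt;/code&amp;gt; is an updatable table), the change could be made with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Replace the 0 values with NULL; no other validation of the rows is done.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;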
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person; it does not appear in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focal pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
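&lt;br /&gt;
For the arriving animid this might be written as follows. It is a sketch only, assuming &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; is an updatable table:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Strip trailing spaces from the arriving animid, touching only rows that need it.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_b_arr_animid = rtrim(fa_b_arr_animid)&lt;br /&gt;
  where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;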
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
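&lt;br /&gt;
One way this could look, as a sketch only: the lookup table name &amp;lt;code&amp;gt;clean.arrival_sources&amp;lt;/code&amp;gt; and its column &amp;lt;code&amp;gt;as_source&amp;lt;/code&amp;gt; are hypothetical placeholders, since the actual ARRIVAL_SOURCES definition is not shown here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical table and column names; adjust to the real ARRIVAL_SOURCES definition.&lt;br /&gt;
insert into clean.arrival_sources (as_source)&lt;br /&gt;
  values (&amp;#039;none&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
-- Use the new code wherever the data source is missing.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  where fa_data_source is NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;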
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=523</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=523"/>
		<updated>2026-02-16T16:21:39Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#26) Mismatch of start-in-nest on follow and follow_arrival */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which help eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
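&lt;br /&gt;
As a rough illustration only (the commit above holds the actual change, which may differ), the substitution can be sketched as a query over the same table.  It assumes &amp;lt;code&amp;gt;made_by&amp;lt;/code&amp;gt; is a text column and that the unknown person&amp;#039;s code is the literal &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt;; the &amp;lt;code&amp;gt;made_by_or_unk&amp;lt;/code&amp;gt; alias is made up for this example.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: show the person each affected row would be attributed to.&lt;br /&gt;
-- made_by_or_unk is a hypothetical alias, not a real column.&lt;br /&gt;
select log.*, coalesce(log.made_by, &amp;#039;UNK&amp;#039;) as made_by_or_unk&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where log.made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;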
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore. Default to data from Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== * (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
If either table says the focal was in a nest, then have the focal be in a nest.&lt;br /&gt;
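&lt;br /&gt;
A minimal sketch of this rule, patterned on the Bad Data query above.  It assumes, as that query does, that a fa_type_of_nesting of 2 or 3 on the last arrival means the focal ended in a nest, and that fol_flag_end_in_nest is 1 when the follow says so.  The &amp;lt;code&amp;gt;end_in_nest&amp;lt;/code&amp;gt; alias is made up for this example, and this is not necessarily how the conversion implements the rule.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT follow.fol_date, follow.fol_b_animid&lt;br /&gt;
       -- In a nest when either table says so (hypothetical alias).&lt;br /&gt;
     , (follow.fol_flag_end_in_nest = 1&lt;br /&gt;
        OR last_arrivals.fa_type_of_nesting IN (2, 3)) AS end_in_nest&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  ORDER BY follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Rows with NULL nesting information (problem #41) can make end_in_nest NULL instead of false.&lt;br /&gt;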
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
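&lt;br /&gt;
Taken together with problem #31, the cleanup amounts to applying &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; to the animid, the same expression used in the problem #33 queries.  A sketch that lists the affected rows next to their cleaned values; the &amp;lt;code&amp;gt;raw_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;cleaned_animid&amp;lt;/code&amp;gt; aliases are made up for this example.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: animids that change when trailing spaces are removed&lt;br /&gt;
-- and the result is forced to upper-case.&lt;br /&gt;
select fol_date&lt;br /&gt;
     , fol_b_animid as raw_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;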
&lt;br /&gt;
== * (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
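&lt;br /&gt;
One way to picture the substitution, shown for the AM period only, is the sketch below.  It follows the column mapping described in problem #36 (*_observer_1 into OBS_BRec, *_observer_2 into OBS_Tiki) and assumes the literal code &amp;lt;code&amp;gt;NONE&amp;lt;/code&amp;gt;; the &amp;lt;code&amp;gt;am_obs_brec&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;am_obs_tiki&amp;lt;/code&amp;gt; aliases are made up for this example and the actual conversion code may differ.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: AM observers, with NONE filling a missing observer.&lt;br /&gt;
-- Only follows with at least one AM observer are listed.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as am_obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as am_obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;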
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
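&lt;br /&gt;
A sketch of the marking step, under loudly stated assumptions: the table name &amp;lt;code&amp;gt;people&amp;lt;/code&amp;gt; and the columns &amp;lt;code&amp;gt;person&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;active&amp;lt;/code&amp;gt; are hypothetical stand-ins, since this page does not give the new database&amp;#039;s actual names.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical names: a people table with person and active columns.&lt;br /&gt;
-- Marks every person code containing a slash as not active.&lt;br /&gt;
UPDATE people&lt;br /&gt;
  SET active = FALSE&lt;br /&gt;
  WHERE STRPOS(person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;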
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates complicates the conversion process, so resolving this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
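&lt;br /&gt;
For illustration, the arrivals that would receive the new code can be listed as below.  This assumes fa_type_of_cycle is a text column (it is compared against &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt; in problem #46); the &amp;lt;code&amp;gt;cycle_or_miss&amp;lt;/code&amp;gt; alias is made up for this example.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: arrivals with no cycle information, shown with MISS.&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_or_miss&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;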
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
This will be re-reviewed should a temporal extension turn out to be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
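&lt;br /&gt;
A minimal sketch of the second change, assuming it is made in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and using the table and column names from the queries above.  This is an illustration, not necessarily the statement used in the conversion:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: reset the cycle code of non-females from 0 to n/a.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;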
&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
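&lt;br /&gt;
The change to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; could be sketched as follows, assuming the table and column names used in the query above.  This is an illustration, not necessarily the statement used in the conversion:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: females with a cycle code of n/a get MISS instead.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;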
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
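&lt;br /&gt;
The two case corrections above could be sketched as follows, assuming they are applied to the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is an illustration, not necessarily the statements used in the conversion:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the character case of the data source codes.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;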
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for a row to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
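&lt;br /&gt;
The third check described under Problem, which leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;, has no query listed.  A sketch of one way to write it, following the pattern of the query above (an illustration, not necessarily the query that produced the 430 row count):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: checking against start time, ignoring end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join dups using (fa_fol_date&lt;br /&gt;
                   , fa_fol_b_focal_animid&lt;br /&gt;
                   , fa_b_arr_animid&lt;br /&gt;
                   , fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;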
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both the start and end times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access by ICG, 10/2025: AMA changed to AME, OBE changed to POR.&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
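&lt;br /&gt;
A minimal sketch of the removal, assuming the column is a text or varchar type (as the comparison in the query above implies) and that the statement is run against the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is an illustration, not necessarily the statement used in the conversion:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal animid.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;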
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The MS Access database likely allows this condition because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be either greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This solution is adequate, but brute-force, because it does not validate anything concerning the rows affected.&lt;br /&gt;
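&lt;br /&gt;
A minimal sketch of the change, using the table and column named in the query above.  This is an illustration, not necessarily the statement used in the conversion:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: replace the 0 animal ID numbers with NULL.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;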
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
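&lt;br /&gt;
The second half of this solution could be sketched as follows.  The sketch assumes that the &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value has already been added to ARRIVAL_SOURCES; the columns of that table are not shown on this page, so that step is left out:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: replace NULL data sources with the none code.&lt;br /&gt;
-- Assumes none already exists in ARRIVAL_SOURCES.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  where fa_data_source is null;&amp;lt;/pre&amp;gt;&lt;br /&gt;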
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=522</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=522"/>
		<updated>2026-02-16T16:17:17Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#25) Follow ends are not last arrivals */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When the query is run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
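&lt;br /&gt;
For reference, the same normalization could be expressed in PostgreSQL against the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema.  This is only a sketch of an equivalent query, not the fix that was applied (that was done by hand in MS Access), and it assumes the column is still textual, as in the dump described here:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show B_AnimID_num with the empty string mapped to NULL.&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;&lt;br /&gt;
     , NULLIF(&amp;quot;B_AnimID_num&amp;quot;, &amp;#039;&amp;#039;) as animid_num&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;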
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
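&lt;br /&gt;
The transformation can be illustrated in PostgreSQL against the textual column as it appeared in the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema before the fix.  This is a sketch, not the procedure used in MS Access; the output column name &amp;lt;code&amp;gt;animid_num&amp;lt;/code&amp;gt; is made up for the example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: capture the digits after a leading CH and cast them to an integer.&lt;br /&gt;
-- Values that do not match the pattern (including the empty string) yield NULL.&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;&lt;br /&gt;
     , substring(&amp;quot;B_AnimID_num&amp;quot; from &amp;#039;^CH([0-9]+)$&amp;#039;)::integer as animid_num&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;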
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
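&lt;br /&gt;
A minimal sketch of such a view follows.  The table and column names (&amp;lt;code&amp;gt;biography_data&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;dad_id&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;dadidprelim&amp;lt;/code&amp;gt;) and the boolean type of the preliminary flag are assumptions made for illustration; they are not taken from the actual schema:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only, with assumed names; dadidprelim is taken to be a boolean.&lt;br /&gt;
CREATE VIEW biography AS&lt;br /&gt;
  SELECT b_animid&lt;br /&gt;
       , CASE WHEN dadidprelim&lt;br /&gt;
              THEN dad_id || &amp;#039;_prelim&amp;#039;&lt;br /&gt;
              ELSE dad_id&lt;br /&gt;
         END AS dadid&lt;br /&gt;
    FROM biography_data;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;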
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
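&lt;br /&gt;
The conversion code itself is not shown on this page.  One way to discard the spurious date portion, given here only as a sketch, is to cast the column to a time:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: keep the time of day and drop the date part of BREC_time.&lt;br /&gt;
select &amp;quot;BREC_FOL_date&amp;quot;&lt;br /&gt;
     , &amp;quot;BREC_time&amp;quot;::TIME as brec_time&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;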
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
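&lt;br /&gt;
The substitution can be sketched as a query on the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This illustrates the rule only and is not the code from the commit; it assumes the unknown person&amp;#039;s code is the literal &amp;#039;UNK&amp;#039;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: substitute the unknown person for a NULL made_by.&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
     , COALESCE(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;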
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error; ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last arrival/last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error; ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
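&lt;br /&gt;
Because the follow begin and end times on FOLLOW are ignored (see also problem #24), the times can be derived from FOLLOW_ARRIVAL instead.  The following is a sketch of that derivation, using the same restriction to the focal&amp;#039;s own arrivals as the query above; the output column names are made up for the example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: take each follow&amp;#039;s begin and end from the focal&amp;#039;s own arrivals.&lt;br /&gt;
SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
     , MIN(fa_time_start) AS follow_begin&lt;br /&gt;
     , MAX(fa_time_end) AS follow_end&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
  GROUP BY fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
  ORDER BY fa_fol_date, fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;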
&lt;br /&gt;
== * (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
If either table says the focal was in a nest, then have the focal be in a nest.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
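&lt;br /&gt;
The &amp;quot;either table&amp;quot; rule can be sketched as a query.  This is an illustration, not the conversion code: the &amp;lt;code&amp;gt;begin_in_nest&amp;lt;/code&amp;gt; column name is made up, and the extra join condition restricting the first arrival to the focal&amp;#039;s own row is an assumption about the intent.  The end-in-nest case would be analogous, using MAX(fa_time_end), nesting types 2 and 3, and fol_flag_end_in_nest.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: the focal begins in a nest if either table says so.&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT follow.fol_date, follow.fol_b_animid&lt;br /&gt;
     , (follow.fol_flag_begin_in_nest = 1&lt;br /&gt;
        OR first_arrivals.fa_type_of_nesting IN (1, 3)) AS begin_in_nest&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          -- Assumption: only the focal&amp;#039;s own arrival row is wanted.&lt;br /&gt;
          AND first_arrivals.fa_b_arr_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  ORDER BY follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;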
&lt;br /&gt;
== * (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
If either table says the focal was in a nest, then have the focal be in a nest.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#27) Mismatch of end-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
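&lt;br /&gt;
As a minimal sketch of the combined cleanup for problems #31 and #32, the following previews the normalized animid next to the original, using the same &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; expression that appears in the problem #33 queries.  The column aliases are illustrative only, and this is not necessarily how the conversion process itself is written.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: rows whose animid changes when trailing spaces are&lt;br /&gt;
-- removed and the result is forced to upper-case.&lt;br /&gt;
select fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as normalized_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;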
&lt;br /&gt;
== * (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the &amp;quot;NONE&amp;quot; (no observer) person for both observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
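&lt;br /&gt;
A minimal sketch of this substitution for the AM time period, assuming the observer columns described in problem #36.  The &amp;lt;code&amp;gt;obs_brec&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;obs_tiki&amp;lt;/code&amp;gt; aliases only mirror the target column names; the query is illustrative and is not taken from the conversion code.  The PM columns would be handled the same way.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: AM rows having at least one observer, with a missing&lt;br /&gt;
-- (NULL or blank) observer replaced by the NONE person.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;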
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for this, and excluding the resulting duplicates, is complicated in the conversion process.  So resolution of this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
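&lt;br /&gt;
A minimal sketch of the substitution, for illustration only.  The &amp;lt;code&amp;gt;cycle_state&amp;lt;/code&amp;gt; alias is hypothetical, and the query assumes &amp;lt;code&amp;gt;fa_type_of_cycle&amp;lt;/code&amp;gt; is a text column, as the comparisons against &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; elsewhere on this page suggest.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: the arrivals lacking cycle information, shown with&lt;br /&gt;
-- the MISS code that would replace the NULL.&lt;br /&gt;
select fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid&lt;br /&gt;
     , coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where fa_type_of_cycle is null&lt;br /&gt;
  order by fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;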
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
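&lt;br /&gt;
A minimal sketch of the case corrections listed above, for illustration only.  The &amp;lt;code&amp;gt;original_source&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;normalized_source&amp;lt;/code&amp;gt; aliases are hypothetical, and only the three spellings named in this solution are rewritten; every other code passes through unchanged.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: each distinct fa_data_source value, the value it&lt;br /&gt;
-- would be normalized to, and the number of rows affected.&lt;br /&gt;
select follow_arrival.fa_data_source as original_source&lt;br /&gt;
     , case&lt;br /&gt;
         when follow_arrival.fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;)&lt;br /&gt;
           then &amp;#039;Brec&amp;#039;&lt;br /&gt;
         when follow_arrival.fa_data_source = &amp;#039;TikI&amp;#039;&lt;br /&gt;
           then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
         else follow_arrival.fa_data_source&lt;br /&gt;
       end as normalized_source&lt;br /&gt;
     , count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;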
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
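&lt;br /&gt;
The check described in the Problem section that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; is not listed here. A sketch of it, assuming it follows the same pattern as the start and end time query, might look like the following. This is not necessarily the query that produced the count of 430.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (a sketch)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Because the comparisons use &amp;lt;code&amp;gt;=&amp;lt;/code&amp;gt;, rows with a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in any of the compared columns are not listed; &amp;lt;code&amp;gt;IS NOT DISTINCT FROM&amp;lt;/code&amp;gt; could be used instead if those rows should be included.&lt;br /&gt;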
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
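&lt;br /&gt;
A minimal sketch of that cleanup, assuming it is applied directly to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the schema qualification is an assumption, since the query above does not name a schema):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: the fix is made in place in clean.follow_arrival&lt;br /&gt;
-- Only rows that actually change are touched&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With a single argument, &amp;lt;code&amp;gt;rtrim()&amp;lt;/code&amp;gt; removes only trailing spaces, so other trailing whitespace such as tabs would be left in place.&lt;br /&gt;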
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The MS Access database likely allows this condition because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything concerning the rows affected, but it is adequate.&lt;br /&gt;
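&lt;br /&gt;
As a sketch, assuming the change is made directly in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, the update could be written as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: the fix is made in place in clean.biography&lt;br /&gt;
-- Replace the placeholder 0 with NULL; no other columns are examined&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;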
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
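&lt;br /&gt;
A sketch of the cleanup for the arriving animid, assuming it is applied to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the query above reads from &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt;, so the target table is an assumption):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: the fix is made in place in clean.follow_arrival&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_b_arr_animid = rtrim(fa_b_arr_animid)&lt;br /&gt;
  where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;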
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
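&lt;br /&gt;
A sketch of the second half of that change, assuming the &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; code has already been added to ARRIVAL_SOURCES. The columns of ARRIVAL_SOURCES are not described on this page, so that step is not shown.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: a &amp;#039;none&amp;#039; code already exists in ARRIVAL_SOURCES&lt;br /&gt;
-- Assumption: the fix is made in place in clean.follow_arrival&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  where fa_data_source is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;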
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=521</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=521"/>
		<updated>2026-02-16T16:16:50Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#24) Follow starts are not first arrivals */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
        SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
        TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024.&lt;br /&gt;
Changed the end date of the KK_P1 membership to 8/14/2022.&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
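&lt;br /&gt;
A minimal sketch of the substitution, for illustration only; the actual change is the one in the commit cited in this section.  It assumes the unknown person is stored with the code &amp;#039;UNK&amp;#039;, and the output column name made_by_converted is hypothetical:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Show each affected log row next to the value it would get&lt;br /&gt;
-- if NULL MadeBy values were replaced with the unknown person.&lt;br /&gt;
select log.*&lt;br /&gt;
     , coalesce(log.made_by, &amp;#039;UNK&amp;#039;) as made_by_converted&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where log.made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;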
&lt;br /&gt;
Fixed in commit 7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== * (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== * (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
If either table says the focal was in a nest, then have the focal be in a nest.&lt;br /&gt;
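&lt;br /&gt;
The rule can be sketched as a query.  This is an illustration, not the conversion code.  It assumes, as the Bad Data query above does, that a fa_type_of_nesting of 1 or 3 and a fol_flag_begin_in_nest of 1 both mean the focal began in a nest.  The output column begin_in_nest is hypothetical, and the first arrival is restricted to the focal&amp;#039;s own arrival row:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT follow.fol_date, follow.fol_b_animid&lt;br /&gt;
       -- In a nest if either table says so; NULLs count as &amp;quot;not in a nest&amp;quot;.&lt;br /&gt;
     , (COALESCE(follow.fol_flag_begin_in_nest = 1, FALSE)&lt;br /&gt;
        OR COALESCE(first_arrivals.fa_type_of_nesting IN (1, 3), FALSE))&lt;br /&gt;
       AS begin_in_nest&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_b_arr_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  ORDER BY follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;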
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== * (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
If either table says the focal was in a nest, then have the focal be in a nest.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#* (#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== * (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
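&lt;br /&gt;
This section gives no query for the affected rows.  One possible query, modeled on the query for problem #37, is sketched below.  It assumes that &amp;quot;only one observer&amp;quot; means exactly one of the two observer columns for a period is NULL or blank; it has not been checked against the data:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Follows where, for the AM or the PM period, one observer&lt;br /&gt;
-- column is blank and the other is not.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , fol_am_observer_1, fol_am_observer_2&lt;br /&gt;
     , fol_pm_observer_1, fol_pm_observer_2&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
     or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;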
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
    WHERE STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, the columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 contain 511 names that differ from another name only by character case.  That corresponds to about half as many unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates during conversion is complicated, so resolving this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
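&lt;br /&gt;
A minimal sketch of the second half of this fix, assuming the update is applied in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and that &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; is stored directly in &amp;lt;code&amp;gt;fa_type_of_cycle&amp;lt;/code&amp;gt; (neither is confirmed here):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: run in the clean schema, once MISS exists as a cycle state.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  WHERE fa_type_of_cycle IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;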
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres, making the database a temporal database, so that change history is tracked and data can be seen as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
This will be re-reviewed should a temporal extension prove impractical (out of budget, etc.).&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
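&lt;br /&gt;
The recoding of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt; for non-female arrivals could be sketched as follows.  This assumes the change is made in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, which is not stated above; it reuses the join from the queries in the Bad data section.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: the fix is applied in the clean schema.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  FROM clean.biography&lt;br /&gt;
  WHERE biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        AND biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        AND follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;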
&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
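&lt;br /&gt;
A sketch of the recoding described above, reusing the join from the Bad data query.  The exact statement used during conversion is not recorded here, so treat this as an illustration only.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: recode n/a to MISS for female arriving individuals.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  FROM clean.biography&lt;br /&gt;
  WHERE biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        AND biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        AND follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;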
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
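&lt;br /&gt;
The two case corrections in this solution could be applied with a single statement.  This is a sketch, assuming the cleanup happens in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (as the Bad data note suggests); it does not cover adding the new codes.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Normalize only the three misspelled codes; other values are untouched.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = CASE fa_data_source&lt;br /&gt;
                         WHEN &amp;#039;BREC&amp;#039; THEN &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         WHEN &amp;#039;brec&amp;#039; THEN &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         WHEN &amp;#039;TikI&amp;#039; THEN &amp;#039;Tiki&amp;#039;&lt;br /&gt;
                       END&lt;br /&gt;
  WHERE fa_data_source IN (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;, &amp;#039;TikI&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;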
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for a row to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
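&lt;br /&gt;
The check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; is described in the problem statement but not listed.  A sketch, assuming it has the same shape as the start-and-end-time query with the end time dropped:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: checking against start time only (fa_time_end left out)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;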
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, October 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
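&lt;br /&gt;
A minimal sketch of that cleanup, assuming the column is a variable-length text type so that the comparison in the query above is meaningful:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Only rows that actually carry trailing spaces are rewritten.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_fol_b_focal_animid = RTRIM(fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE fa_fol_b_focal_animid &amp;lt;&amp;gt; RTRIM(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;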
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The MS Access database seems to allow this condition, likely because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything concerning the rows affected, but it is adequate.&lt;br /&gt;
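&lt;br /&gt;
As a sketch, assuming the change is made in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema where the rows were listed:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: applied in the clean schema; no other columns are checked.&lt;br /&gt;
UPDATE clean.biography&lt;br /&gt;
  SET b_animid_num = NULL&lt;br /&gt;
  WHERE b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;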
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
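&lt;br /&gt;
A minimal sketch, assuming the fix is applied to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and that no trimmed value collides with an existing row under a unique constraint (not checked here):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Trim only the rows that actually have trailing spaces.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_b_arr_animid = rtrim(fa_b_arr_animid)&lt;br /&gt;
  WHERE fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&amp;lt;/pre&amp;gt;&lt;br /&gt;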
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
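&lt;br /&gt;
A minimal sketch of the second half of this change, assuming a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; row has already been added to ARRIVAL_SOURCES (that table&amp;#039;s column names are not given on this page, so the insert is not shown):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Assumes the literal code is &amp;#039;none&amp;#039;; adjust to match ARRIVAL_SOURCES.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  WHERE fa_data_source IS NULL;&amp;lt;/pre&amp;gt;&lt;br /&gt;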
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=520</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=520"/>
		<updated>2026-02-16T16:07:45Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* Solution */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which help eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, table does not exist&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== * (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== * (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== * (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
If either table says the focal was in a nest, then have the focal be in a nest.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When the solution is applied to both problems #26 and #27,&lt;br /&gt;
with follow_arrival updated so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
297 follow_arrival rows are updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
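&lt;br /&gt;
The following is a minimal sketch of the &amp;quot;either table&amp;quot; rule for the start-in-nest case, run against made-up sample rows rather than real data.  It assumes, based on the query above, that fa_type_of_nesting values 1 and 3 mean the focal began in a nest and that 2 means the focal only ended in one.  The value 0 for &amp;quot;no nest&amp;quot; is an assumption, and the update actually used by the conversion may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample rows; not taken from the database.&lt;br /&gt;
-- Assumed codes: 0 = no nest, 1 = began in nest,&lt;br /&gt;
--                2 = ended in nest, 3 = both.&lt;br /&gt;
WITH sample (fa_type_of_nesting, fol_flag_begin_in_nest) AS&lt;br /&gt;
  (VALUES (0, 1), (1, 0), (2, 1), (3, 0), (0, 0))&lt;br /&gt;
SELECT sample.*&lt;br /&gt;
     , CASE&lt;br /&gt;
         -- The follow says began-in-nest; the arrival says ended-in-nest only&lt;br /&gt;
         WHEN fol_flag_begin_in_nest = 1 AND fa_type_of_nesting = 2 THEN 3&lt;br /&gt;
         -- The follow says began-in-nest; the arrival says no nest at the start&lt;br /&gt;
         WHEN fol_flag_begin_in_nest = 1&lt;br /&gt;
              AND fa_type_of_nesting NOT IN (1, 3) THEN 1&lt;br /&gt;
         -- Otherwise the arrival already agrees or already says in-nest&lt;br /&gt;
         ELSE fa_type_of_nesting&lt;br /&gt;
       END AS merged_nesting&lt;br /&gt;
  FROM sample;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;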
&lt;br /&gt;
== * (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
If either table says the focal was in a nest, then have the focal be in a nest.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#* (#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These comprise &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
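&lt;br /&gt;
As an illustration of the cleanup described in problems #31 and #32, here is a small sketch using made-up animid values.  It only shows the effect of &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt;, the same expression used in the problem #33 query, and is not necessarily how the conversion process itself is written.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample values; &amp;#039;XX&amp;#039; is not a real animid.&lt;br /&gt;
WITH sample (fol_b_animid) AS&lt;br /&gt;
  (VALUES (&amp;#039;XX  &amp;#039;), (&amp;#039;xx&amp;#039;), (&amp;#039;XX&amp;#039;))&lt;br /&gt;
SELECT fol_b_animid AS raw_animid&lt;br /&gt;
       -- Strip trailing spaces, then force upper-case&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) AS cleaned_animid&lt;br /&gt;
  FROM sample;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;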
&lt;br /&gt;
== * (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These comprise 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the &amp;quot;NONE&amp;quot; (no observer) person for both observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
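&lt;br /&gt;
The substitution described in problems #36 through #38 can be sketched as follows for the AM observer columns, using made-up observer codes.  This is an illustrative sketch only.  The output column names follow OBS_BRec and OBS_Tiki, and the sample codes are hypothetical.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample rows; &amp;#039;AB&amp;#039; and &amp;#039;CD&amp;#039; are made-up observer codes.&lt;br /&gt;
WITH sample (fol_am_observer_1, fol_am_observer_2) AS&lt;br /&gt;
  (VALUES (&amp;#039;AB&amp;#039;, NULL), (&amp;#039;  &amp;#039;, &amp;#039;CD&amp;#039;), (&amp;#039;AB&amp;#039;, &amp;#039;CD&amp;#039;), (NULL, &amp;#039;&amp;#039;))&lt;br /&gt;
SELECT -- A blank or NULL observer becomes the &amp;quot;NONE&amp;quot; person&lt;br /&gt;
       coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) AS obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) AS obs_tiki&lt;br /&gt;
  FROM sample&lt;br /&gt;
  -- No row is produced for a period when both observers are missing&lt;br /&gt;
  WHERE coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        OR coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;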
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
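&lt;br /&gt;
One possible way to pick a preferred spelling per name is sketched below with made-up codes.  It treats a code as mixed-case when it equals neither its upper-case nor its lower-case form, and breaks any remaining ties alphabetically.  Both choices are assumptions, and the linked notes describe what was actually done.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample codes; not taken from the database.&lt;br /&gt;
WITH sample (person) AS&lt;br /&gt;
  (VALUES (&amp;#039;ABC&amp;#039;), (&amp;#039;Abc&amp;#039;), (&amp;#039;abc&amp;#039;), (&amp;#039;XYZ&amp;#039;))&lt;br /&gt;
SELECT DISTINCT ON (lower(person))&lt;br /&gt;
       lower(person) AS folded&lt;br /&gt;
     , person AS preferred&lt;br /&gt;
  FROM sample&lt;br /&gt;
  -- Within each case-folded group, mixed-case spellings sort first&lt;br /&gt;
  ORDER BY lower(person)&lt;br /&gt;
         , (person &amp;lt;&amp;gt; upper(person) AND person &amp;lt;&amp;gt; lower(person)) DESC&lt;br /&gt;
         , person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;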
&lt;br /&gt;
&lt;br /&gt;
== * (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
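&lt;br /&gt;
The cycle code rules from problems #42 and #46 can be sketched together as a single expression, shown here against made-up rows.  This is only an illustration of the stated rules, with NULL becoming &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; and a non-female &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; becoming &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.  The ordering of the two rules is an assumption.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample rows; not taken from the database.&lt;br /&gt;
WITH sample (b_sex, fa_type_of_cycle) AS&lt;br /&gt;
  (VALUES (&amp;#039;M&amp;#039;, &amp;#039;0&amp;#039;), (&amp;#039;M&amp;#039;, NULL), (&amp;#039;F&amp;#039;, NULL)&lt;br /&gt;
        , (&amp;#039;F&amp;#039;, &amp;#039;U&amp;#039;), (&amp;#039;M&amp;#039;, &amp;#039;n/a&amp;#039;))&lt;br /&gt;
SELECT sample.*&lt;br /&gt;
     , CASE&lt;br /&gt;
         -- Missing cycle information, for either sex&lt;br /&gt;
         WHEN fa_type_of_cycle IS NULL THEN &amp;#039;MISS&amp;#039;&lt;br /&gt;
         -- Non-females coded 0 are recoded to n/a&lt;br /&gt;
         WHEN b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039; AND fa_type_of_cycle = &amp;#039;0&amp;#039; THEN &amp;#039;n/a&amp;#039;&lt;br /&gt;
         ELSE fa_type_of_cycle&lt;br /&gt;
       END AS converted_cycle&lt;br /&gt;
  FROM sample;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;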
&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
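&lt;br /&gt;
To make the endpoint remark concrete, the sketch below applies the same date arithmetic as the test to two made-up rows that straddle a 15th birthday.  The dates are hypothetical and chosen only to show the boundary.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample rows; not taken from the database.&lt;br /&gt;
WITH sample (b_birthdate, fa_fol_date) AS&lt;br /&gt;
  (VALUES (DATE &amp;#039;2000-06-01&amp;#039;, DATE &amp;#039;2015-05-31&amp;#039;)&lt;br /&gt;
        , (DATE &amp;#039;2000-06-01&amp;#039;, DATE &amp;#039;2015-06-01&amp;#039;))&lt;br /&gt;
SELECT sample.*&lt;br /&gt;
       -- False one day before the 15th birthday, true on the birthday itself&lt;br /&gt;
     , b_birthdate &amp;lt;= (fa_fol_date&lt;br /&gt;
                         - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                         - &amp;#039;14 years&amp;#039;::INTERVAL) AS too_old&lt;br /&gt;
  FROM sample;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;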
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
	and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
		)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
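&lt;br /&gt;
A minimal sketch of the change, assuming it is applied with a single statement to the same rows the query above selects; the statement actually used in the conversion may differ:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: recode n/a to MISS for female arriving individuals&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;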
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
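&lt;br /&gt;
The two case corrections above could be applied along these lines. This is a sketch, assuming the changes are made to follow_arrival in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema; it does not cover adding the new codes:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the letter case of the data source codes&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;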
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
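&lt;br /&gt;
The check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; is not shown above.  A sketch of it, assuming it follows the same pattern as the start and end time query, would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: checking against start time, leaving off end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;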
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed FN community start date in MS Access - Oct 2025&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
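&lt;br /&gt;
A sketch of the cleanup, assuming the table is follow_arrival in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and that only the focal id column is affected:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal animid&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;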
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
	       , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between column names and primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything concerning the rows affected, but it is adequate.&lt;br /&gt;
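&lt;br /&gt;
A sketch of the change, assuming it is made directly to biography in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace 0 animal ID numbers with NULL&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;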
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=472</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=472"/>
		<updated>2025-10-31T16:45:40Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* Bad solution */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which help eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== * (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
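&lt;br /&gt;
A minimal sketch of the substitution, assuming it can be expressed with &amp;lt;code&amp;gt;coalesce&amp;lt;/code&amp;gt;; the conversion code itself may do this differently, and the &amp;lt;code&amp;gt;made_by_converted&amp;lt;/code&amp;gt; column name is hypothetical:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show each log row with UNK substituted for a NULL made_by.&lt;br /&gt;
select log.*&lt;br /&gt;
     , coalesce(log.made_by, &amp;#039;UNK&amp;#039;) as made_by_converted&lt;br /&gt;
  from clean.community_membership_update_log as log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;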
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== * (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is a data entry error; ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== * (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is a data entry error; ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== * (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 1,848 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
If either table says the focal was in a nest, then have the focal be in a nest.&lt;br /&gt;
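&lt;br /&gt;
The rule can be sketched as a query for the start of the follow.  It reuses the joins from the Bad Data query above and assumes, as that query does, that &amp;lt;code&amp;gt;fa_type_of_nesting&amp;lt;/code&amp;gt; values 1 and 3 mean the focal began in a nest.  The &amp;lt;code&amp;gt;begins_in_nest&amp;lt;/code&amp;gt; column name is hypothetical, and a NULL in either source column can yield NULL rather than false, so this is an illustration, not the conversion code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: the focal begins in a nest if either table says so.&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT follow.fol_date, follow.fol_b_animid&lt;br /&gt;
     , (follow.fol_flag_begin_in_nest = 1&lt;br /&gt;
        OR first_arrivals.fa_type_of_nesting IN (1, 3)) AS begins_in_nest&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  ORDER BY follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;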
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== * (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
If either table says the focal was in a nest, then have the focal be in a nest.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of 9 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
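&lt;br /&gt;
A sketch of the combined cleanup for this problem and problem #31, using the same &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; expression that the problem #33 queries apply.  The &amp;lt;code&amp;gt;cleaned_animid&amp;lt;/code&amp;gt; column name is hypothetical, and the conversion code may do this differently:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: list the follows whose animid would change, with the cleaned value.&lt;br /&gt;
select fol_date&lt;br /&gt;
     , fol_b_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;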
&lt;br /&gt;
== * (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows dated before their focal was under study, that is, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person both observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires that a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there is otherwise no value.&lt;br /&gt;
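&lt;br /&gt;
No query is recorded for this problem.  A minimal sketch, assuming that &amp;quot;only one observer&amp;quot; means exactly one of a period&amp;#039;s two observer columns is NULL or blank after trimming, could list the affected follows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: follows where the AM or the PM period has exactly one observer.&lt;br /&gt;
-- Each parenthesized test is true when the column is NULL or blank;&lt;br /&gt;
-- the two tests differ only when exactly one observer is missing.&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
             &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;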
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
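&lt;br /&gt;
The query above joins each name to its differently-cased variants, so it lists only the &amp;quot;/&amp;quot; names that have such a variant.  A simpler sketch, offered as a cross-check rather than as the query used in the conversion, lists every distinct observer value containing a &amp;quot;/&amp;quot; along with how often it is used:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: every distinct observer value containing a &amp;quot;/&amp;quot;&lt;br /&gt;
select observers.person, count(*) as uses&lt;br /&gt;
  from (select btrim(fol_am_observer_1) as person from clean.follow&lt;br /&gt;
        union all&lt;br /&gt;
        select btrim(fol_am_observer_2) from clean.follow&lt;br /&gt;
        union all&lt;br /&gt;
        select btrim(fol_pm_observer_1) from clean.follow&lt;br /&gt;
        union all&lt;br /&gt;
        select btrim(fol_pm_observer_2) from clean.follow) as observers&lt;br /&gt;
  where strpos(observers.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
  group by observers.person&lt;br /&gt;
  order by observers.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;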
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So the number of unique names is about half that.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates complicates the conversion process, so resolving this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
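&lt;br /&gt;
To see how many unique names are involved, rather than how many spellings, the names can be grouped by their lower-cased form.  This is a sketch built on the same four columns; the output column names &amp;lt;code&amp;gt;folded&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;spellings&amp;lt;/code&amp;gt; are made up for illustration:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: one row per case-insensitive name that has more than one spelling&lt;br /&gt;
with people as (&lt;br /&gt;
  select btrim(fol_am_observer_1) as person from easy.follow&lt;br /&gt;
  union&lt;br /&gt;
  select btrim(fol_am_observer_2) from easy.follow&lt;br /&gt;
  union&lt;br /&gt;
  select btrim(fol_pm_observer_1) from easy.follow&lt;br /&gt;
  union&lt;br /&gt;
  select btrim(fol_pm_observer_2) from easy.follow&lt;br /&gt;
)&lt;br /&gt;
select lower(people.person) as folded, count(*) as spellings&lt;br /&gt;
  from people&lt;br /&gt;
  where people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  group by lower(people.person)&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by lower(people.person);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;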
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 217 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 169 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
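&lt;br /&gt;
The change of cycle code &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt; could be written as below.  This is a sketch that assumes the change is applied to &amp;lt;code&amp;gt;follow_arrival&amp;lt;/code&amp;gt; in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema; this page does not say where the conversion makes it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: non-females with a cycle code of 0 get n/a instead&lt;br /&gt;
-- (assumes the change belongs in the clean schema)&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;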
&lt;br /&gt;
&lt;br /&gt;
== * (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 239 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 127 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
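&lt;br /&gt;
A sketch of the change, using the same join and conditions as the query above.  It assumes no further conditions apply; the actual cleanup code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: females with a cycle code of n/a get MISS instead&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;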
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
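&lt;br /&gt;
The two spelling fixes could be applied with statements like the following.  This is a sketch, assuming the fix is made on &amp;lt;code&amp;gt;follow_arrival&amp;lt;/code&amp;gt; in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, as the note under Bad data suggests; the actual conversion code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: normalize the case variants of Brec&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
-- Sketch: normalize the case variant of Tiki&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;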
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
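&lt;br /&gt;
No query is recorded for the start-time-only check described in the Problem.  A sketch, derived from the query above by dropping &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and assuming the 430-row count was produced the same way:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: checking against start time only&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;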
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 688 rows, having only 3 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 26 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=471</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=471"/>
		<updated>2025-10-31T16:45:19Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#55) There are community_membership rows that place an individual in a community before birth */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted the row.&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted the time stamp.&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted the time stamp.&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
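&lt;br /&gt;
A minimal sketch of one way such values could be normalized, assuming the column is a timestamp (as the cast in the query above suggests) and that only the time of day is meaningful.  The output column name is hypothetical, and the actual conversion code may do this differently:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustrative only: casting to TIME discards the date portion.&lt;br /&gt;
-- brec_time_only is a hypothetical column name.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;::TIME as brec_time_only&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;&lt;br /&gt;
  where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;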
&lt;br /&gt;
== * (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 by ICG.&lt;br /&gt;
Changed the end date of the KK_P1 membership to 8/14/2022.&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
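&lt;br /&gt;
A minimal sketch of the substitution, assuming the unknown person is stored under the code &amp;#039;UNK&amp;#039; and using a hypothetical output column name.  The conversion code in the commit cited below may implement it differently:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustrative only: made_by_or_unk is a hypothetical column name.&lt;br /&gt;
select log.*&lt;br /&gt;
     , coalesce(log.made_by, &amp;#039;UNK&amp;#039;) as made_by_or_unk&lt;br /&gt;
  from clean.community_membership_update_log as log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;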
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== * (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is a data entry error; ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== * (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is a data entry error; ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== * (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 1,848 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
If either table says the focal was in a nest, then have the focal be in a nest.&lt;br /&gt;
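&lt;br /&gt;
A minimal sketch of this rule for the start of the follow, not the conversion&amp;#039;s actual code.  It assumes, as the query above implies, that fa_type_of_nesting values of 1 and 3 mean the focal began in a nest.  The begin_in_nest column name is illustrative, and the restriction to the focal&amp;#039;s own arrival row is an added assumption.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT follow.fol_date, follow.fol_b_animid&lt;br /&gt;
       -- In a nest when either table says so.&lt;br /&gt;
     , (first_arrivals.fa_type_of_nesting IN (1, 3)&lt;br /&gt;
        OR follow.fol_flag_begin_in_nest = 1) AS begin_in_nest&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          -- Assumption: use only the focal&amp;#039;s own arrival row.&lt;br /&gt;
          AND first_arrivals.fa_b_arr_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  ORDER BY follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;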
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== * (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
If either table says the focal was in a nest, then have the focal be in a nest.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These involve 9 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
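&lt;br /&gt;
As a sketch of the cleanup described for problems #31 and #32 (the conversion&amp;#039;s actual code may differ), the following lists each raw animid that the cleanup would change, alongside its cleaned form.  The column aliases are illustrative.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select distinct fol_b_animid as raw_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by raw_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;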
&lt;br /&gt;
== * (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These involve 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
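&lt;br /&gt;
The mapping described above can be sketched as a query.  This is an illustration only, not the conversion&amp;#039;s actual code: the &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; period codes and the output column names are assumptions.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- One row per follow and time period, skipping periods&lt;br /&gt;
-- where both observer columns are NULL or blank.&lt;br /&gt;
SELECT fol_date, fol_b_animid, &amp;#039;AM&amp;#039; AS period&lt;br /&gt;
     , NULLIF(BTRIM(fol_am_observer_1), &amp;#039;&amp;#039;) AS obs_brec&lt;br /&gt;
     , NULLIF(BTRIM(fol_am_observer_2), &amp;#039;&amp;#039;) AS obs_tiki&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
  WHERE COALESCE(BTRIM(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        OR COALESCE(BTRIM(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
UNION ALL&lt;br /&gt;
SELECT fol_date, fol_b_animid, &amp;#039;PM&amp;#039; AS period&lt;br /&gt;
     , NULLIF(BTRIM(fol_pm_observer_1), &amp;#039;&amp;#039;) AS obs_brec&lt;br /&gt;
     , NULLIF(BTRIM(fol_pm_observer_2), &amp;#039;&amp;#039;) AS obs_tiki&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
  WHERE COALESCE(BTRIM(fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        OR COALESCE(BTRIM(fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
ORDER BY fol_date, fol_b_animid, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;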
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these variants and excluding duplicates in the conversion process is complicated, so resolving this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
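&lt;br /&gt;
A sketch of the substitution, assuming it is applied with a simple COALESCE.  The cycle_state alias is illustrative, and the conversion&amp;#039;s actual code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid&lt;br /&gt;
     , coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where fa_type_of_cycle is null&lt;br /&gt;
  order by fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;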
&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 217 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 169 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
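&lt;br /&gt;
A sketch of how the recoding could be previewed before it is applied.  It assumes the cycle code is stored as text, so the code is compared against &amp;#039;0&amp;#039;.  The old_cycle and new_cycle aliases are illustrative.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.fa_fol_date&lt;br /&gt;
     , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
     , follow_arrival.fa_b_arr_animid&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , follow_arrival.fa_type_of_cycle as old_cycle&lt;br /&gt;
     , case when follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;&lt;br /&gt;
              then &amp;#039;n/a&amp;#039;&lt;br /&gt;
            else follow_arrival.fa_type_of_cycle&lt;br /&gt;
       end as new_cycle&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;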
&lt;br /&gt;
&lt;br /&gt;
== * (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 239 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 127 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
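&lt;br /&gt;
A minimal sketch of the two case corrections above, assuming (hypothetically) that they are applied directly to the &amp;lt;code&amp;gt;follow_arrival&amp;lt;/code&amp;gt; table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema; the conversion may instead apply them elsewhere:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: the fix is made in clean.follow_arrival.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;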
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and checking only the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
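&lt;br /&gt;
The check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; has no query listed. A sketch, assuming it follows the same pattern as the query above with that column dropped:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (assumed form of the check)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;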
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 688 rows, having only 3 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 26 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN birthdate in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=465</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=465"/>
		<updated>2025-10-23T20:27:11Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* Solution */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== * (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
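&lt;br /&gt;
A sketch of the substitution, assuming it can be expressed as a query over the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; table used above and that &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; is the identifier stored for the unknown person; the committed conversion code may differ:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: &amp;#039;UNK&amp;#039; identifies the unknown person and made_by is textual.&lt;br /&gt;
-- made_by_or_unk is a hypothetical column name used only for illustration.&lt;br /&gt;
select log.*&lt;br /&gt;
     , coalesce(log.made_by, &amp;#039;UNK&amp;#039;) as made_by_or_unk&lt;br /&gt;
  from clean.community_membership_update_log as log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;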
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single person code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== * (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== * (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== * (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 1,848 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
If either table says the focal was in a nest, then have the focal be in a nest.&lt;br /&gt;
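&lt;br /&gt;
A sketch of that rule, reusing the joins of the query above.  The &amp;lt;code&amp;gt;begin_in_nest&amp;lt;/code&amp;gt; column name is an assumption, and the result can be NULL when either input is NULL:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: the begin-in-nest value the rule would give each follow.&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT follow.fol_date, follow.fol_b_animid&lt;br /&gt;
     , (follow.fol_flag_begin_in_nest = 1&lt;br /&gt;
        OR first_arrivals.fa_type_of_nesting IN (1, 3)) AS begin_in_nest&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  ORDER BY follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;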
&lt;br /&gt;
== * (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
If either table says the focal was in a nest, then have the focal be in a nest.&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of 9 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== * (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
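&lt;br /&gt;
The row generation described above might be sketched as follows.  The period codes &amp;lt;code&amp;gt;AM&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;PM&amp;lt;/code&amp;gt; and the output column names are assumptions, not necessarily what the conversion program uses:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: one row per follow and time period having an observer.&lt;br /&gt;
select fol_date, fol_b_animid, &amp;#039;AM&amp;#039; as period&lt;br /&gt;
     , nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) as obs_brec&lt;br /&gt;
     , nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
union all&lt;br /&gt;
select fol_date, fol_b_animid, &amp;#039;PM&amp;#039; as period&lt;br /&gt;
     , nullif(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;)&lt;br /&gt;
     , nullif(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;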
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
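&lt;br /&gt;
A sketch of a query that lists the affected follows, those where exactly one of a period&amp;#039;s two observer columns has a value.  It is an illustration, not the conversion program&amp;#039;s code:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Comparing the two &amp;quot;is empty&amp;quot; tests with &amp;lt;&amp;gt; is true when exactly one is empty.&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;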
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
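&lt;br /&gt;
A sketch of the marking step, assuming the new PEOPLE table has a &amp;lt;code&amp;gt;person&amp;lt;/code&amp;gt; column and a boolean &amp;lt;code&amp;gt;active&amp;lt;/code&amp;gt; column.  Both names are assumptions about the new design:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical table and column names; adjust to the actual design.&lt;br /&gt;
update people&lt;br /&gt;
  set active = false&lt;br /&gt;
  where strpos(person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;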
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolving this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 217 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 169 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
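&lt;br /&gt;
To see how the number of problem rows would change with the age limit, the &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; rows can be tallied by age in whole years.  This is a sketch only: it assumes &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;b_birthdate&amp;lt;/code&amp;gt; are dates, and the &amp;lt;code&amp;gt;age()&amp;lt;/code&amp;gt; arithmetic may differ from the interval test above around leap days.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Count female U rows by the arriving individual&amp;#039;s age in whole years.&lt;br /&gt;
select date_part(&amp;#039;year&amp;#039;&lt;br /&gt;
                , age(follow_arrival.fa_fol_date&lt;br /&gt;
                    , biography.b_birthdate)) as age_years&lt;br /&gt;
     , count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
  group by age_years&lt;br /&gt;
  order by age_years;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;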
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 239 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
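&lt;br /&gt;
The rows can also be summarized per individual, which may help show whether the problem is a few wrong departure dates or many wrong arrivals.  A sketch using only the columns from the query above; the output column names are illustrative:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- One row per arriving individual seen after their departure date.&lt;br /&gt;
select biography.b_animid&lt;br /&gt;
     , biography.b_departdate&lt;br /&gt;
     , count(*)&lt;br /&gt;
     , min(follow_arrival.fa_fol_date) as first_late_follow&lt;br /&gt;
     , max(follow_arrival.fa_fol_date) as last_late_follow&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  group by biography.b_animid, biography.b_departdate&lt;br /&gt;
  order by biography.b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;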
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 127 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
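&lt;br /&gt;
A minimal sketch of the change, assuming it is expressed as an &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; against the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (the actual conversion code is not shown on this page):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical cleanup; not taken from the conversion scripts.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;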
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
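&lt;br /&gt;
The spelling changes could be sketched as a single statement.  This assumes the normalization is applied to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; with an &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt;, which this page does not confirm:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical cleanup; only the three misspellings listed above are touched.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = case fa_data_source&lt;br /&gt;
                         when &amp;#039;BREC&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         when &amp;#039;brec&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         when &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
                       end&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;, &amp;#039;TikI&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;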
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
The combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt; is always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and checking just the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
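&lt;br /&gt;
No query is given above for the start-time-only check described in the Problem section.  A sketch, adapted from the previous query by dropping &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;; it is assumed, not confirmed, that this is how the 430-row figure was obtained:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (fa_time_end left off)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;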
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both the start and end times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 688 rows, having only 3 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 26 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=457</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=457"/>
		<updated>2025-10-22T22:37:56Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#35) There are follows done before a focal was under study */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
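&lt;br /&gt;
How the conversion fixes the data is not described here.  One possible approach, shown only as a sketch, is to keep just the time portion with a cast; this assumes &amp;lt;code&amp;gt;BREC_time&amp;lt;/code&amp;gt; is a timestamp in the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema, and &amp;lt;code&amp;gt;brec_time_only&amp;lt;/code&amp;gt; is an illustrative name:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Show the stored value next to its time-only portion for the bad rows.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;&lt;br /&gt;
     , &amp;quot;BREC_time&amp;quot;::TIME as brec_time_only&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;&lt;br /&gt;
  where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;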
&lt;br /&gt;
== * (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; the communities they name do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
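&lt;br /&gt;
As a rough illustration only (the conversion program&amp;#039;s actual code is not shown on this page), the substitution could be sketched as below.  The &amp;lt;code&amp;gt;converted_made_by&amp;lt;/code&amp;gt; column alias is hypothetical, and the query assumes an &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; person has already been created.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: show the value that would replace each NULL made_by.&lt;br /&gt;
-- &amp;#039;UNK&amp;#039; is the unknown person described above; the alias is illustrative.&lt;br /&gt;
select log.*&lt;br /&gt;
     , coalesce(log.made_by, &amp;#039;UNK&amp;#039;) as converted_made_by&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where log.made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;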
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== * (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is a data entry error; ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== * (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is a data entry error; ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== * (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 1,848 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN clean.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN clean.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of 9 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
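&lt;br /&gt;
A minimal sketch of the combined cleanup from problems #31 and #32, using the same &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; expression that the queries in problem #33 use.  The &amp;lt;code&amp;gt;cleaned_animid&amp;lt;/code&amp;gt; alias is hypothetical, and this is not necessarily how the conversion program itself does it.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: show each raw animid beside its normalized form,&lt;br /&gt;
-- for the rows where the two differ.&lt;br /&gt;
select fol_b_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;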
&lt;br /&gt;
== * (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
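&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
No query is recorded for this problem.  As a rough sketch, modelled on the query in problem #37 and assuming that &amp;quot;only one observer&amp;quot; means exactly one of a time period&amp;#039;s two observer columns is filled in, something like the following might list the affected follows:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: follows where, in the AM or the PM period, one observer&lt;br /&gt;
-- column is blank (NULL or empty after trimming) and the other is not.&lt;br /&gt;
-- Each parenthesized test is a boolean; &amp;lt;&amp;gt; is true when they differ.&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
           &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;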
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
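&lt;br /&gt;
The query above lists each spelling on its own row.  As a rough sketch, assuming PostgreSQL and the same &amp;lt;code&amp;gt;easy.follow&amp;lt;/code&amp;gt; columns, the case variants could also be grouped so that each set of spellings appears on one row.  The &amp;lt;code&amp;gt;folded&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;variants&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;spellings&amp;lt;/code&amp;gt; column names are made up for illustration and are not part of the conversion.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: one row per set of names that differ only by case.&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  -- UNION removes exact duplicates across the four observer columns.&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
  UNION&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
  UNION&lt;br /&gt;
  SELECT BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
  UNION&lt;br /&gt;
  SELECT BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
)&lt;br /&gt;
SELECT LOWER(new_people.person) AS folded&lt;br /&gt;
     , COUNT(*) AS variants&lt;br /&gt;
     , STRING_AGG(new_people.person, &amp;#039;, &amp;#039;&lt;br /&gt;
                  ORDER BY new_people.person) AS spellings&lt;br /&gt;
  FROM new_people&lt;br /&gt;
  WHERE new_people.person IS NOT NULL&lt;br /&gt;
        AND new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  GROUP BY LOWER(new_people.person)&lt;br /&gt;
  HAVING COUNT(*) &amp;gt; 1&lt;br /&gt;
  ORDER BY LOWER(new_people.person);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;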
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access by ICG, 10/2025.&lt;br /&gt;
&lt;br /&gt;
== * (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows whose focal and date have no matching row in the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
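&lt;br /&gt;
Because several arrivals can belong to the same missing follow, it may help to count them per focal and date.  The following is a minimal sketch, assuming PostgreSQL and the same tables as the query above; the &amp;lt;code&amp;gt;arrivals&amp;lt;/code&amp;gt; column name is made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: one row per focal and date that lacks a follow row.&lt;br /&gt;
SELECT follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
     , follow_arrival.fa_fol_date&lt;br /&gt;
     , COUNT(*) AS arrivals&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  GROUP BY follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;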
&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 217 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 169 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 239 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 127 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
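&lt;br /&gt;
To check the endpoint described above, the age in completed years can be shown next to each row.  This is only a sketch: it assumes PostgreSQL, assumes that &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;b_birthdate&amp;lt;/code&amp;gt; are date columns, and uses a made-up &amp;lt;code&amp;gt;age_years&amp;lt;/code&amp;gt; column name.  The &amp;lt;code&amp;gt;WHERE&amp;lt;/code&amp;gt; clause is copied from the query above.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: the same rows, with the age in completed years.&lt;br /&gt;
select follow_arrival.fa_fol_date&lt;br /&gt;
     , follow_arrival.fa_b_arr_animid&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
     , extract(year from age(follow_arrival.fa_fol_date&lt;br /&gt;
                           , biography.b_birthdate)) as age_years&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by age_years desc&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;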
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for two rows to be duplicates?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
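&lt;br /&gt;
The problem statement also mentions a check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;, but no query is given for it.  A sketch of what that check might look like, adapted from the query above by dropping the end time, follows.  It assumes PostgreSQL and has not been confirmed to be the query that produced the reported count.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: checking against start time, ignoring end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that rows with a NULL &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; are grouped together by the &amp;lt;code&amp;gt;group by&amp;lt;/code&amp;gt; but never match in the &amp;lt;code&amp;gt;exists&amp;lt;/code&amp;gt; test, because &amp;lt;code&amp;gt;=&amp;lt;/code&amp;gt; does not treat NULLs as equal.&lt;br /&gt;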
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 688 rows, having only 3 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fix in the MS Access data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 26 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=456</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=456"/>
		<updated>2025-10-22T22:36:16Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#41)There are follow arrivals with NULL nesting information */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
        SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
        TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted the row.&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted the time stamp.&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted the time stamp.&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
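&lt;br /&gt;
The conversion code itself is not shown here.  As a minimal sketch, assuming the fix amounts to keeping only the time portion of BREC_time, a query like the following previews the value each affected row would end up with.  The &amp;lt;code&amp;gt;time_only&amp;lt;/code&amp;gt; alias is made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show the stored value next to its time portion,&lt;br /&gt;
-- for the rows whose date portion is not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;&lt;br /&gt;
     , &amp;quot;BREC_time&amp;quot;::TIME WITHOUT TIME ZONE as time_only&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;&lt;br /&gt;
  where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;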
&lt;br /&gt;
== * (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
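&lt;br /&gt;
As a minimal sketch of the substitution described above, and not the actual conversion code, a query along these lines shows the MadeBy value each row would carry once NULL is replaced with the unknown person.  It assumes the unknown person&amp;#039;s code is the literal string &amp;#039;UNK&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: substitute the assumed &amp;#039;UNK&amp;#039; person code for NULL made_by values.&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
     , coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log&lt;br /&gt;
  order by date_of_update, chimp_id;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;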
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== * (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is a data entry error; ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== * (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is a data entry error; ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== * (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 1,848 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN clean.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN clean.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of 9 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
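&lt;br /&gt;
Combined with the trailing-space removal of problem #31, the cleanup can be previewed with a query like the following.  This is a sketch of the normalization, not the conversion code itself; the &amp;lt;code&amp;gt;original_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;cleaned_animid&amp;lt;/code&amp;gt; column aliases are made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: list each follow animid that the cleanup would change,&lt;br /&gt;
-- next to its trimmed, upper-cased form.&lt;br /&gt;
select distinct fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by original_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;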
&lt;br /&gt;
== * (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
No checks are done to ensure that the start and stop times of the follow bear any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observer in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
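&lt;br /&gt;
A simpler check is sketched below, assuming that any trimmed observer value containing a &amp;quot;/&amp;quot; names two people.  It is an illustration only, not the query that produced the counts above.  It lists distinct values rather than rows, and values such as &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt; also match.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: distinct observer values that contain a &amp;quot;/&amp;quot;&lt;br /&gt;
SELECT people.person&lt;br /&gt;
  FROM (SELECT BTRIM(follow.fol_am_observer_1) AS person FROM clean.follow&lt;br /&gt;
        UNION&lt;br /&gt;
        SELECT BTRIM(follow.fol_am_observer_2) FROM clean.follow&lt;br /&gt;
        UNION&lt;br /&gt;
        SELECT BTRIM(follow.fol_pm_observer_1) FROM clean.follow&lt;br /&gt;
        UNION&lt;br /&gt;
        SELECT BTRIM(follow.fol_pm_observer_2) FROM clean.follow&lt;br /&gt;
       ) AS people&lt;br /&gt;
  -- NULL observers drop out here, because STRPOS(NULL, ...) is NULL&lt;br /&gt;
  WHERE STRPOS(people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
  ORDER BY people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;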
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ from another name only by character case.  That is about half as many unique names once case is ignored.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates complicates the conversion process, so resolving this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 217 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 169 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 239 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 127 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
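&lt;br /&gt;
A minimal sketch of this recoding, assuming it is applied as a direct update in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (the actual conversion step may differ):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: recode female arrivals from n/a to MISS&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;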
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
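&lt;br /&gt;
The two case corrections in this solution could be applied along the following lines.  This is a sketch only; it assumes the corrections are made directly in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and covers only the three variant spellings named above.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the case variants of Brec and Tiki&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = case&lt;br /&gt;
                         when fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;) then &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         when fa_data_source = &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
                       end&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;, &amp;#039;TikI&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;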
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
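&lt;br /&gt;
The check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; can be sketched the same way.  This is an illustration, not confirmed to be the query that produced the 430-row figure.  Note that rows with a NULL &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; never match in the join.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: checking against start time alone&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
       join dups on (follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
                     and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                         = dups.fa_fol_b_focal_animid&lt;br /&gt;
                     and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
                     and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;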
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 688 rows, having only 3 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fix in the MS Access data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 26 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=455</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=455"/>
		<updated>2025-10-22T22:35:14Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#34) There are duplicate animid, date combinations on FOLLOW */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column contains non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== * (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
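&lt;br /&gt;
A minimal sketch of the substitution described above, assuming the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; person code already exists; the query only illustrates the idea and is not necessarily what the conversion code in the commit does:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: report each log row with a NULL made_by replaced by UNK.&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
     , coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;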
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== * (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is a data entry error; ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== * (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is a data entry error; ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== * (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 1,848 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN clean.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN clean.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of 9 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== * (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that were done before their focal was under study, that is, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
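&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
No query for this problem is recorded on this page.  The following is a sketch, not the query used by the conversion.  It assumes that a time period counts as having only one observer when exactly one of its two observer columns is non-empty, and it reuses the emptiness test from problem #37.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch (assumption): a period has one observer when exactly one&lt;br /&gt;
-- of its two observer columns is NULL or empty after trimming.&lt;br /&gt;
-- Comparing the two boolean tests with &amp;lt;&amp;gt; is true when they differ.&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
     or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;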
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for this and excluding duplicates in the conversion process is complicated, so resolving it is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 217 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 169 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 239 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 127 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;, or&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
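&lt;br /&gt;
The case corrections above can be previewed with a query like the following.  This is only a sketch: the actual cleanup that populates the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema is not shown on this page, and the column alias &amp;lt;code&amp;gt;fixed_data_source&amp;lt;/code&amp;gt; is made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: show each raw fa_data_source value next to the value it&lt;br /&gt;
-- would have after the case corrections.  Values not named in the&lt;br /&gt;
-- corrections (e.g. Tiki_GM) pass through unchanged.&lt;br /&gt;
select fa_data_source&lt;br /&gt;
     , case&lt;br /&gt;
         when fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;) then &amp;#039;Brec&amp;#039;&lt;br /&gt;
         when fa_data_source = &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
         else fa_data_source&lt;br /&gt;
       end as fixed_data_source  -- hypothetical alias&lt;br /&gt;
     , count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  group by fa_data_source&lt;br /&gt;
  order by fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;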
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
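&lt;br /&gt;
The check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; can be written the same way.  This is a sketch, assuming the same &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; columns as the queries above; it has not been run against the data.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (end time left off)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;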
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 688 rows, having only 3 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fix in the MS Access data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 26 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=454</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=454"/>
		<updated>2025-10-22T22:34:01Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#33) Some follows have an animid that does not exist, even after animid cleanup */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column contains non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
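&lt;br /&gt;
How the conversion fixes the data is not shown on this page.  One plausible approach, offered only as an assumption, is to keep just the time-of-day portion of each value by casting to &amp;lt;code&amp;gt;TIME&amp;lt;/code&amp;gt;.  The column aliases below are hypothetical.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: only the time of day is meaningful in BREC_time,&lt;br /&gt;
-- so the date portion can be discarded.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot; as original_value&lt;br /&gt;
     , &amp;quot;BREC_time&amp;quot;::TIME as time_only&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;&lt;br /&gt;
  where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;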
&lt;br /&gt;
== * (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS, their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
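&lt;br /&gt;
The substitution described in this solution can be sketched with &amp;lt;code&amp;gt;coalesce&amp;lt;/code&amp;gt;.  This is an illustration of the idea only, not the code from the commit; it assumes the unknown person&amp;#039;s code is literally &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt;, and the &amp;lt;code&amp;gt;made_by_or_unk&amp;lt;/code&amp;gt; alias is hypothetical.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: show the NULL made_by values replaced with the&lt;br /&gt;
-- unknown person code (assumed here to be &amp;#039;UNK&amp;#039;).&lt;br /&gt;
select log.*&lt;br /&gt;
     , coalesce(log.made_by, &amp;#039;UNK&amp;#039;) as made_by_or_unk&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where log.made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;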
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== * (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== * (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== * (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 1,848 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN clean.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN clean.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These involve 9 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
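&lt;br /&gt;
A minimal sketch of this cleanup, combined with the trailing-space removal of problem #31.  The sample values are made up for illustration, and the actual conversion code may do this differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample data; not taken from the real follow table.&lt;br /&gt;
select raw_animid&lt;br /&gt;
     , upper(rtrim(raw_animid)) as clean_animid&lt;br /&gt;
  from (values (&amp;#039;FD &amp;#039;), (&amp;#039;gm&amp;#039;), (&amp;#039;FR&amp;#039;))&lt;br /&gt;
         as sample (raw_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;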
&lt;br /&gt;
== * (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These involve 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these and excluding duplicates in the conversion process is complicated, so the resolution of this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
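&lt;br /&gt;
A minimal sketch of the substitution, assuming it is done with a simple &amp;lt;code&amp;gt;coalesce&amp;lt;/code&amp;gt;.  The sample rows are made up, and where in the conversion the change is actually applied is not stated here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample rows; the second has no cycle information.&lt;br /&gt;
select arr_animid&lt;br /&gt;
     , coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state&lt;br /&gt;
  from (values (&amp;#039;AAA&amp;#039;, &amp;#039;U&amp;#039;), (&amp;#039;BBB&amp;#039;, null))&lt;br /&gt;
         as sample (arr_animid, fa_type_of_cycle);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;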
&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 217 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 169 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
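&lt;br /&gt;
A minimal sketch of the second change, written as a &amp;lt;code&amp;gt;case&amp;lt;/code&amp;gt; expression.  The sample rows and the &amp;lt;code&amp;gt;new_cycle&amp;lt;/code&amp;gt; column name are made up for illustration; this is not necessarily how the conversion implements it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample rows: only the non-female with code 0 is changed.&lt;br /&gt;
select b_sex&lt;br /&gt;
     , fa_type_of_cycle&lt;br /&gt;
     , case when b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039; and fa_type_of_cycle = &amp;#039;0&amp;#039;&lt;br /&gt;
              then &amp;#039;n/a&amp;#039;&lt;br /&gt;
            else fa_type_of_cycle&lt;br /&gt;
       end as new_cycle&lt;br /&gt;
  from (values (&amp;#039;M&amp;#039;, &amp;#039;0&amp;#039;), (&amp;#039;M&amp;#039;, &amp;#039;n/a&amp;#039;), (&amp;#039;F&amp;#039;, &amp;#039;0&amp;#039;))&lt;br /&gt;
         as sample (b_sex, fa_type_of_cycle);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;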
&lt;br /&gt;
&lt;br /&gt;
== * (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 239 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 127 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problem rows will have to be corrected in the data.  Alternatively, we can adjust the hard limit and set a soft limit of 14 years in the warning system, with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
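&lt;br /&gt;
To help prioritize the corrections, the problem rows can be counted per arriving individual.  This is only a sketch built from the query under &amp;quot;Bad data&amp;quot;, with the same age test; the column aliases are illustrative and it is not part of the conversion itself.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: count the too-old &amp;#039;U&amp;#039; rows per arriving female&lt;br /&gt;
select follow_arrival.fa_b_arr_animid&lt;br /&gt;
     , count(*) as problem_rows&lt;br /&gt;
     , min(follow_arrival.fa_fol_date) as first_fol_date&lt;br /&gt;
     , max(follow_arrival.fa_fol_date) as last_fol_date&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by problem_rows desc, follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;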
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
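&lt;br /&gt;
A minimal sketch of the change, assuming it is applied as a single &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; against the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  The statement actually used by the conversion is not shown on this page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: recode n/a to MISS for female arriving individuals&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;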
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
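&lt;br /&gt;
The capitalization fixes for &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt; could be sketched as one statement.  This assumes the correction is made in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema; the conversion may implement it differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: normalize the mis-capitalized data source codes&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = case fa_data_source&lt;br /&gt;
                         when &amp;#039;BREC&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         when &amp;#039;brec&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         when &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
                       end&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;, &amp;#039;TikI&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;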
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for two rows to be duplicates?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
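&lt;br /&gt;
The check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; has no query listed here.  A sketch of it, assuming it follows the same pattern as the start and end time query:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: checking against start time only&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;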
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 688 rows, having only 3 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fix in the MS Access data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 26 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=453</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=453"/>
		<updated>2025-10-22T22:33:11Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#30) Some follows have no community */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== * (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
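&lt;br /&gt;
The join condition above requires the first row to start strictly before the second, so two rows for the same individual that start on the same date are not reported.  A sketch of a complementary check, assuming such rows would also count as a problem:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: memberships for one individual that share a start date&lt;br /&gt;
select cm_b_animid&lt;br /&gt;
     , cm_start_date&lt;br /&gt;
     , count(*) as memberships&lt;br /&gt;
  from clean.community_membership&lt;br /&gt;
  group by cm_b_animid&lt;br /&gt;
         , cm_start_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by cm_b_animid, cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;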
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
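&lt;br /&gt;
A minimal sketch of the substitution, assuming it can be expressed as a query over the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (the actual conversion code may do this differently):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: the made_by value that would replace each NULL.&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
     , coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log&lt;br /&gt;
  where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;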
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== * (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is data entry error, ignore the problem and ignore the data in the follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== * (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is data entry error, ignore the problem and ignore the data in the follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== * (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 1,848 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN clean.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN clean.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These involve 9 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
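&lt;br /&gt;
As a rough sketch, both this cleanup and the trailing-space removal of problem #31 can be previewed against the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.  The column aliases are made up for illustration, and the conversion program itself may apply the cleanup differently:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: animids that change when trimmed and upper-cased.&lt;br /&gt;
select fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;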
&lt;br /&gt;
== * (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These involve 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 columns go into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 columns go into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person both observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
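&lt;br /&gt;
A hedged sketch of how the substitution might look for the AM period, following the column mapping described in problem #36.  The output column names are hypothetical, and the real conversion may be structured differently:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: AM observers, with NONE filling a missing observer.&lt;br /&gt;
-- A row is produced only when at least one AM observer is recorded.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as am_obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as am_obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;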
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolution of this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
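&lt;br /&gt;
A minimal sketch of the substitution, assuming the cycle code is stored as text and that &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; is used verbatim as the code.  The &amp;lt;code&amp;gt;cycle_state&amp;lt;/code&amp;gt; alias is made up for illustration:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: arrivals that would receive the MISS cycle state.&lt;br /&gt;
select fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid&lt;br /&gt;
     , coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;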
&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 217 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 169 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
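&lt;br /&gt;
The change from &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt; could be sketched as the statement below.  This is only an illustration: it assumes the change is made in place in &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;, which this page does not state, and the actual conversion may make the change elsewhere.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only; not taken from the conversion code.&lt;br /&gt;
-- Assumption: the change is made in place in the clean schema.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;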
&lt;br /&gt;
&lt;br /&gt;
== * (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 239 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 127 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
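&lt;br /&gt;
To make the date arithmetic in the query concrete, here is a small worked example.  The follow date of 2000-06-15 is made up for illustration.  Subtracting 1 year and then 14 years gives 1985-06-15, so a female born on or before that day is flagged, and her age at the follow is at least 15 years.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Worked example with a hypothetical follow date.&lt;br /&gt;
-- The first column is the latest birthdate the query above would flag.&lt;br /&gt;
-- The second column is the age, at the follow, of a female born on that day.&lt;br /&gt;
select date &amp;#039;2000-06-15&amp;#039;&lt;br /&gt;
         - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
         - &amp;#039;14 years&amp;#039;::INTERVAL as latest_flagged_birthdate&lt;br /&gt;
     , age(date &amp;#039;2000-06-15&amp;#039;, date &amp;#039;1985-06-15&amp;#039;) as age_at_follow;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;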
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
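&lt;br /&gt;
The two spelling corrections listed in this solution can be previewed, without changing any data, with a query along these lines.  This is a sketch and not the conversion code; the column aliases &amp;lt;code&amp;gt;old_value&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;new_value&amp;lt;/code&amp;gt; are made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Read-only preview: each existing value, what it would become, and a row count.&lt;br /&gt;
select fa_data_source as old_value&lt;br /&gt;
     , case&lt;br /&gt;
         when fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;) then &amp;#039;Brec&amp;#039;&lt;br /&gt;
         when fa_data_source = &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
         else fa_data_source&lt;br /&gt;
       end as new_value&lt;br /&gt;
     , count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  group by 1, 2&lt;br /&gt;
  order by 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;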
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
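&lt;br /&gt;
No query is given above for the check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;.  A sketch of it follows, assuming it has the same shape as the start and end time check.  It was not used to produce the 430 row count reported in the problem description, so the count it returns may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: checking against start time only&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join dups using (fa_fol_date&lt;br /&gt;
                   , fa_fol_b_focal_animid&lt;br /&gt;
                   , fa_b_arr_animid&lt;br /&gt;
                   , fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;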
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 688 rows, having only 3 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fix in the MS Access data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 26 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=338</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=338"/>
		<updated>2024-02-15T19:21:52Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#13) 234 COMM_MEMBS rows place individuals in a community before their BIOGRAPHY.StartDate */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text, never &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID_publication_info&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column contains values that are not AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== * (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
== * (#13) 234 COMM_MEMBS rows place individuals in a community before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
234 COMM_MEMBS rows place individuals into a community before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_entrydate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) 6 COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
6 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_departdate, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 by ICG.&lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022.&lt;br /&gt;
&lt;br /&gt;
== * (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Have the conversion program change all the NULL values to &amp;lt;nowiki&amp;gt;&amp;#039;&amp;#039;&amp;lt;/nowiki&amp;gt;, the empty string.&lt;br /&gt;
&lt;br /&gt;
== * (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Need to talk to Karl.&lt;br /&gt;
&lt;br /&gt;
== * (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== * (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Talk to Karl&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=337</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=337"/>
		<updated>2024-02-15T19:14:13Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which help eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== * (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
== * (#13) 234 COMM_MEMBS rows place individuals in a community before their BIOGRAPHY.StartDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
234 COMM_MEMBS rows place individuals into a community before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_entrydate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) 6 COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
6 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_departdate, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 by ICG.&lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022.&lt;br /&gt;
&lt;br /&gt;
== * (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Have the conversion program change all the NULL values to &amp;lt;nowiki&amp;gt;&amp;#039;&amp;#039;&amp;lt;/nowiki&amp;gt;, the empty string.&lt;br /&gt;
&lt;br /&gt;
== * (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Need to talk to Karl.&lt;br /&gt;
&lt;br /&gt;
== * (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
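&lt;br /&gt;
A minimal sketch of the substitution described above, using Python's sqlite3 module as a stand-in for the real database. The table and column names (people, active, update_log) are hypothetical and chosen only for illustration; the sketch marks UNK as inactive but does not show how inactive people would be rejected in new data.&lt;br /&gt;

```python
import sqlite3

# Hypothetical miniature tables; the real SokweDB schema may differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (person TEXT PRIMARY KEY, active INTEGER NOT NULL);
    CREATE TABLE update_log (id INTEGER PRIMARY KEY, made_by TEXT);
    INSERT INTO people VALUES ('ICG', 1);
    INSERT INTO update_log (made_by) VALUES ('ICG'), (NULL), (NULL);
""")

# Create the unknown person, marked inactive (assumed flag column).
conn.execute("INSERT INTO people VALUES ('UNK', 0)")

# Point every NULL made_by at the unknown person.
conn.execute("UPDATE update_log SET made_by = 'UNK' WHERE made_by IS NULL")

# Count what is left: no NULLs, and two rows now attributed to UNK.
null_rows = conn.execute(
    "SELECT count(*) FROM update_log WHERE made_by IS NULL").fetchone()[0]
unk_rows = conn.execute(
    "SELECT count(*) FROM update_log WHERE made_by = 'UNK'").fetchone()[0]
print(null_rows, unk_rows)
```

After the update, the made_by column can be declared NOT NULL and can reference the people table, which is the property the solution is after.&lt;br /&gt;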
&lt;br /&gt;
&lt;br /&gt;
== * (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== * (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Talk to Karl&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=334</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=334"/>
		<updated>2024-02-08T20:59:43Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
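&lt;br /&gt;
The fix itself was made by hand in MS Access. As an illustration only, the same transformation can be sketched in Python; the function name and the sample values below are hypothetical, and treating the empty string as missing follows Problem #6.&lt;br /&gt;

```python
def animid_num_to_int(value):
    """Convert a textual B_AnimID_num such as 'CH012' to an int, or None."""
    # Missing values (NULL or the empty string) stay missing.
    if value is None or value == "":
        return None
    # Anything without the expected prefix is reported rather than guessed at.
    if not value.startswith("CH"):
        raise ValueError(f"unexpected B_AnimID_num: {value!r}")
    # Drop the two-character prefix and parse the remainder as an integer.
    return int(value[2:])


# Made-up sample values, not taken from the real BIOGRAPHY table.
examples = ["CH7", "CH012", "", None]
converted = [animid_num_to_int(v) for v in examples]
print(converted)
```

Raising on an unexpected prefix is a design choice of this sketch: it surfaces values that do not fit the pattern instead of silently converting them.&lt;br /&gt;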
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text in this column.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column contains non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
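&lt;br /&gt;
One way such a view could look is sketched below with Python's sqlite3 module standing in for the real database. The lower-case table and column names, the sample IDs, and the rule that a confirmed Dad_ID takes precedence over DadIDPrelim are all assumptions made for illustration, not the actual SokweDB definition.&lt;br /&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical cut-down BIOGRAPHY_DATA table with made-up rows.
    CREATE TABLE biography_data (
        animid TEXT PRIMARY KEY,
        dad_id TEXT,
        dadidprelim TEXT);
    INSERT INTO biography_data VALUES
        ('AAA', 'BBB', NULL),
        ('CCC', NULL, 'DDD'),
        ('EEE', NULL, NULL);

    -- The view shows one dadid column: the confirmed father if present,
    -- otherwise the preliminary father with the '_prelim' suffix.
    -- NULL || '_prelim' is NULL, so rows with neither value stay NULL.
    CREATE VIEW biography AS
        SELECT animid,
               COALESCE(dad_id, dadidprelim || '_prelim') AS dadid
          FROM biography_data;
""")

rows = conn.execute(
    "SELECT animid, dadid FROM biography ORDER BY animid").fetchall()
print(rows)
```

Keeping the two values in separate columns of the underlying table means the confirmed column can still be constrained to valid AnimID values, while the view reproduces the combined form that existing users expect.&lt;br /&gt;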
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== * (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Add KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
&lt;br /&gt;
== * (#13) 234 COMM_MEMBS rows place individuals in a community before their BIOGRAPHY.StartDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
234 COMM_MEMBS rows place individuals into a community before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_entrydate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) 6 COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
6 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_departdate, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== * (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Have the conversion program change all the NULL values to &amp;lt;nowiki&amp;gt;&amp;#039;&amp;#039;&amp;lt;/nowiki&amp;gt;, the empty string.&lt;br /&gt;
&lt;br /&gt;
== * (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Need to talk to Karl.&lt;br /&gt;
&lt;br /&gt;
== * (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== * (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Talk to Karl&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=333</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=333"/>
		<updated>2024-02-08T20:55:09Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* Solution */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== * (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; the communities they name do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Add KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
&lt;br /&gt;
== * (#13) 234 COMM_MEMBS rows place individuals in a community before their BIOGRAPHY.StartDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
234 COMM_MEMBS rows place individuals into a community before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_entrydate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) 6 COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
6 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_departdate, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== * (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Have the conversion program change all the NULL values to &amp;lt;nowiki&amp;gt;&amp;#039;&amp;#039;&amp;lt;/nowiki&amp;gt;, the empty string.&lt;br /&gt;
&lt;br /&gt;
== * (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Need to talk to Karl.&lt;br /&gt;
&lt;br /&gt;
== * (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== * (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Talk to Karl&lt;br /&gt;
&lt;br /&gt;
== * (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single person code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=332</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=332"/>
		<updated>2024-02-08T20:54:48Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#21) There are 6 rows in BIOGRAPHY_LOG where the Description is NULL */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text values in this column.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
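&lt;br /&gt;
As an illustration only, assuming the fix keeps the time-of-day portion of BREC_time and discards the date portion (an assumption; the actual conversion logic is not shown on this page), the cleaned value could be computed like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sketch: keep only the time of day, dropping the date.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;::TIME as brec_time&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;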
&lt;br /&gt;
== * (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Add KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
&lt;br /&gt;
== * (#13) 234 COMM_MEMBS rows place individuals in a community before their BIOGRAPHY.StartDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
234 COMM_MEMBS rows place individuals into a community before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_entrydate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) 6 COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
6 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_departdate, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== * (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Have the conversion program change all the NULL values to &amp;lt;nowiki&amp;gt;&amp;#039;&amp;#039;&amp;lt;/nowiki&amp;gt;, the empty string.&lt;br /&gt;
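&lt;br /&gt;
A minimal sketch of the equivalent change expressed in SQL, assuming it is applied directly to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; table queried above rather than inside the conversion program (an assumption made for illustration):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sketch: replace NULL MadeBy values with the empty string.&lt;br /&gt;
update clean.community_membership_update_log&lt;br /&gt;
  set made_by = &amp;#039;&amp;#039;&lt;br /&gt;
  where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;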
&lt;br /&gt;
== * (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Need to talk to Karl.&lt;br /&gt;
&lt;br /&gt;
== * (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
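&lt;br /&gt;
A minimal sketch of this approach in SQL, assuming a row can be added to &amp;lt;code&amp;gt;clean.people&amp;lt;/code&amp;gt; by supplying only the &amp;lt;code&amp;gt;person&amp;lt;/code&amp;gt; column (an assumption; the table may require other columns, and the real conversion program may work differently):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sketch: add the unknown person once.&lt;br /&gt;
insert into clean.people (person) values (&amp;#039;UNK&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
-- Point every NULL MadeBy at the unknown person.&lt;br /&gt;
update clean.community_membership_update_log&lt;br /&gt;
  set made_by = &amp;#039;UNK&amp;#039;&lt;br /&gt;
  where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;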
&lt;br /&gt;
&lt;br /&gt;
== * (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== * (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
</feed>