<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://sokwe.janegoodall.org/w/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=IanGilby</id>
	<title>sokwedb - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://sokwe.janegoodall.org/w/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=IanGilby"/>
	<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/wiki/Special:Contributions/IanGilby"/>
	<updated>2026-04-09T03:39:08Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.35.6</generator>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=548</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=548"/>
		<updated>2026-03-25T22:51:50Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#64) There are follow_arrival rows with NULL fa_type_of_certainty values */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When the query is run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
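&lt;br /&gt;
The page does not show how the conversion makes this fix. As a minimal sketch, assuming the fix amounts to keeping only the time-of-day portion of &amp;lt;code&amp;gt;BREC_time&amp;lt;/code&amp;gt; (the &amp;lt;code&amp;gt;brec_time_only&amp;lt;/code&amp;gt; alias is hypothetical), the cleaned values could be previewed from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema with:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: discard the date portion, keep the time of day.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;::TIME as brec_time_only&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;&lt;br /&gt;
  where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;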
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
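&lt;br /&gt;
A minimal sketch of the substitution, assuming the unknown person&amp;#039;s code is the literal &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; and that it is applied with &amp;lt;code&amp;gt;COALESCE&amp;lt;/code&amp;gt; (the conversion program itself may do this differently; the &amp;lt;code&amp;gt;made_by_or_unk&amp;lt;/code&amp;gt; alias is hypothetical):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: report UNK wherever made_by is NULL.&lt;br /&gt;
select log.*, coalesce(log.made_by, &amp;#039;UNK&amp;#039;) as made_by_or_unk&lt;br /&gt;
  from clean.community_membership_update_log as log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;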
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column &lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error. Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column &lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error. Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
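&lt;br /&gt;
Because problems #31 and #32 both affect &amp;lt;code&amp;gt;fol_b_animid&amp;lt;/code&amp;gt;, the two cleanups can be previewed together.  The following is only a sketch, not the actual conversion code; the &amp;lt;code&amp;gt;cleaned_animid&amp;lt;/code&amp;gt; column alias is made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: preview the animids that either cleanup would change.&lt;br /&gt;
-- cleaned_animid is a hypothetical alias, not a column in the database.&lt;br /&gt;
select follow.fol_b_animid&lt;br /&gt;
     , upper(rtrim(follow.fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(follow.fol_b_animid)) &amp;lt;&amp;gt; follow.fol_b_animid&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;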
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names, and excluding the duplicates, is complicated to do in the conversion process.  So resolution of this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
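&lt;br /&gt;
One way to express the substitution is sketched below.  It assumes the &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; code already exists in &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt;; the &amp;lt;code&amp;gt;cycle_state&amp;lt;/code&amp;gt; alias is made up for illustration, and the actual conversion code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: substitute MISS wherever the cycle code is NULL.&lt;br /&gt;
-- cycle_state is a hypothetical alias, not a column in the database.&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , coalesce(follow_arrival.fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state&lt;br /&gt;
  from clean.follow_arrival;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;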
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with the community entry date or the biography.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
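&lt;br /&gt;
A minimal sketch of such an update follows.  It assumes the &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; code is already a valid cycle code, and it may not match the statement the conversion actually runs.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: females with a cycle code of n/a get MISS instead.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;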
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
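&lt;br /&gt;
The case corrections above could be applied with statements like the following.  This is only a sketch: it assumes the fix is made by updating &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; directly, which may not be how the conversion actually does it.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only; assumes a direct update of the clean schema.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;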
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the &amp;lt;code&amp;gt;follow_arrival&amp;lt;/code&amp;gt; table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
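&lt;br /&gt;
A minimal sketch of such a cleanup follows.  It assumes the trimming is done with a direct update of &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;; the conversion may do this differently.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only; assumes a direct update of the clean schema.&lt;br /&gt;
-- The where clause limits the update to rows that actually change.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;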
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between column names and primary key designations.&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force, but adequate, solution: it does not validate anything concerning the rows affected.&lt;br /&gt;
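&lt;br /&gt;
As a sketch, assuming the change is made with a direct update of &amp;lt;code&amp;gt;clean.biography&amp;lt;/code&amp;gt; (the conversion may do this differently):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only; assumes a direct update of the clean schema.&lt;br /&gt;
-- No other columns are examined, hence &amp;quot;brute-force&amp;quot;.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;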
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the &amp;lt;code&amp;gt;follow_arrival&amp;lt;/code&amp;gt; table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN fixed in Access 3/25/2026&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=547</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=547"/>
		<updated>2026-03-25T22:46:37Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#60) Invalid biography_update_log.made_by values */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
        SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
        TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access: changed the end date of the KK_P1 membership to 8/14/2022. ICG 2/1/2024&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
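&lt;br /&gt;
A minimal sketch of the substitution, assuming the conversion simply replaces a NULL MadeBy with the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; code; the actual conversion code may differ.  The &amp;lt;code&amp;gt;sample_log&amp;lt;/code&amp;gt; rows, the &amp;lt;code&amp;gt;log_id&amp;lt;/code&amp;gt; column, and the &amp;lt;code&amp;gt;XYZ&amp;lt;/code&amp;gt; person code are made up for illustration:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample rows, not real data.&lt;br /&gt;
with sample_log (log_id, made_by) as&lt;br /&gt;
  (values (1, &amp;#039;XYZ&amp;#039;), (2, NULL::text))&lt;br /&gt;
select log_id&lt;br /&gt;
     , coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from sample_log&lt;br /&gt;
  order by log_id;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Row 2 comes back with a made_by of &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt;; row 1 is unchanged.&lt;br /&gt;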
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
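&lt;br /&gt;
A minimal sketch of this cleanup combined with the trailing-space removal of problem #31, using made-up animid values; the actual conversion code may differ:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample values, not real animids.&lt;br /&gt;
with sample_follow (fol_b_animid) as&lt;br /&gt;
  (values (&amp;#039;AB  &amp;#039;), (&amp;#039;cd&amp;#039;), (&amp;#039;EF&amp;#039;))&lt;br /&gt;
select fol_b_animid as original&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned&lt;br /&gt;
  from sample_follow;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;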
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
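&lt;br /&gt;
The observer conversion described above can be sketched roughly as follows.  This is an illustration only, not the conversion code: the sample follow, the person codes &amp;lt;code&amp;gt;P1&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;P2&amp;lt;/code&amp;gt;, and the period codes &amp;lt;code&amp;gt;AM&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;PM&amp;lt;/code&amp;gt; are assumptions.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample follow; the period codes &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; are assumed.&lt;br /&gt;
with sample_follow (fol_date, fol_b_animid&lt;br /&gt;
                  , fol_am_observer_1, fol_am_observer_2&lt;br /&gt;
                  , fol_pm_observer_1, fol_pm_observer_2) as&lt;br /&gt;
  (values (&amp;#039;2000-01-01&amp;#039;::date, &amp;#039;AB&amp;#039;&lt;br /&gt;
         , &amp;#039;P1&amp;#039;, &amp;#039;P2&amp;#039;, &amp;#039; &amp;#039;, NULL::text))&lt;br /&gt;
, periods as&lt;br /&gt;
  -- One candidate row per time period of each follow.&lt;br /&gt;
  (select fol_date, fol_b_animid, &amp;#039;AM&amp;#039; as period&lt;br /&gt;
        , fol_am_observer_1 as obs_brec, fol_am_observer_2 as obs_tiki&lt;br /&gt;
     from sample_follow&lt;br /&gt;
   union all&lt;br /&gt;
   select fol_date, fol_b_animid, &amp;#039;PM&amp;#039;&lt;br /&gt;
        , fol_pm_observer_1, fol_pm_observer_2&lt;br /&gt;
     from sample_follow)&lt;br /&gt;
-- Keep a period only when at least one observer is recorded.&lt;br /&gt;
select *&lt;br /&gt;
  from periods&lt;br /&gt;
  where coalesce(btrim(obs_brec), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(obs_tiki), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only the AM row is returned for the sample follow, because both PM observers are blank or NULL.&lt;br /&gt;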
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
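&lt;br /&gt;
Note that the query above joins &amp;lt;code&amp;gt;uniq_people&amp;lt;/code&amp;gt; to itself on the lower-cased name, so it reports only &amp;quot;/&amp;quot; values that also have a differently-cased variant.  The following is a minimal sketch, assuming the intent is to list every observer value containing a &amp;quot;/&amp;quot; regardless of case variants.  The &amp;lt;code&amp;gt;people&amp;lt;/code&amp;gt; alias and the &amp;lt;code&amp;gt;uses&amp;lt;/code&amp;gt; column are illustrative names, not part of the conversion.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: list each distinct observer value containing a &amp;#039;/&amp;#039;,&lt;br /&gt;
-- with the number of observer columns, over all rows, that hold it.&lt;br /&gt;
SELECT people.person, COUNT(*) AS uses&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
       CROSS JOIN LATERAL&lt;br /&gt;
         (VALUES (BTRIM(follow.fol_am_observer_1))&lt;br /&gt;
               , (BTRIM(follow.fol_am_observer_2))&lt;br /&gt;
               , (BTRIM(follow.fol_pm_observer_1))&lt;br /&gt;
               , (BTRIM(follow.fol_pm_observer_2))&lt;br /&gt;
         ) AS people (person)&lt;br /&gt;
  -- NULL observers drop out here, because STRPOS(NULL, ...) is NULL.&lt;br /&gt;
  WHERE STRPOS(people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
  GROUP BY people.person&lt;br /&gt;
  ORDER BY LOWER(people.person), people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;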
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  Ignoring case, that is about half as many unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolving this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with either the community entry date or the biography data.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
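&lt;br /&gt;
As a rough sketch of that change, assuming it is applied directly to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; with the same join and conditions as the query above (the statement the conversion actually runs may differ):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: recode n/a to MISS for arriving females.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;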
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of: &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
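&lt;br /&gt;
The two case fixes could be sketched as follows, assuming they are made in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (the bad-data query above notes that the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; data has already been cleaned, so these statements are illustrative rather than the ones actually used):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the case variants of Brec.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
-- Sketch only: normalize the case variant of Tiki.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;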
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
Rows are always unique on the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and checking only the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
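&lt;br /&gt;
The problem description also mentions a check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;, but no query is given for it.  A minimal sketch of that check, assuming it differs from the query above only in dropping &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; from the grouping and from the match:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: checking against start time alone&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Like the query above, this matches with &amp;lt;code&amp;gt;=&amp;lt;/code&amp;gt;, so rows with a NULL in any of the matched columns are not returned.&lt;br /&gt;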
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed FN community start date in MS Access - Oct 2025&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
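&lt;br /&gt;
A minimal sketch of that cleanup, assuming the column is a &amp;lt;code&amp;gt;text&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;varchar&amp;lt;/code&amp;gt; type, so that trailing spaces are significant in comparisons (the statement actually used in the conversion may differ):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal id, touching only&lt;br /&gt;
-- the rows that the bad-data query above would report.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;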
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The MS Access database seems to allow this condition, likely because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
An earlier version of the query above spelled the column &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimId&amp;lt;/code&amp;gt; (lower-case &amp;quot;d&amp;quot;) and failed with:&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything concerning the rows affected, but it is adequate.&lt;br /&gt;
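&lt;br /&gt;
A minimal sketch of the change, assuming it is applied directly to &amp;lt;code&amp;gt;clean.biography&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: turn every 0 animal ID number into NULL, with no further validation.&lt;br /&gt;
UPDATE clean.biography&lt;br /&gt;
  SET b_animid_num = NULL&lt;br /&gt;
  WHERE b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;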
&lt;br /&gt;
== (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN CHANGED THE SINGLE SF/EVL ENTRY TO EVL IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
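&lt;br /&gt;
A minimal sketch of this cleanup, assuming the table meant is &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the query above reads from the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema) and that the statement is run by hand:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only; the schema name is an assumption.&lt;br /&gt;
-- Touch only the rows whose arriving animid actually ends in spaces.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_b_arr_animid = rtrim(fa_b_arr_animid)&lt;br /&gt;
  WHERE fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&amp;lt;/pre&amp;gt;&lt;br /&gt;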
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
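&lt;br /&gt;
A minimal sketch of the second half of this proposal. It assumes the code is spelled exactly &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; and has already been added to ARRIVAL_SOURCES; the column layout of that table is not given here, so the insert of the new code is left out:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: the &amp;#039;none&amp;#039; spelling, and its presence in ARRIVAL_SOURCES, are assumptions.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  WHERE fa_data_source IS NULL;&amp;lt;/pre&amp;gt;&lt;br /&gt;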
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=546</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=546"/>
		<updated>2026-03-25T22:01:58Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#46) There are follow_arrivals where non-females have a cycle code that is other than n/a */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
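&lt;br /&gt;
How the conversion fixes the values is not recorded here. One possible approach, shown only as a sketch, is to keep just the time-of-day portion by casting to &amp;lt;code&amp;gt;TIME&amp;lt;/code&amp;gt;; the &amp;lt;code&amp;gt;time_only&amp;lt;/code&amp;gt; column alias is made up for the example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: this is an assumption, not necessarily what the conversion program does.&lt;br /&gt;
-- Preview the affected values next to their time-of-day portion.&lt;br /&gt;
SELECT &amp;quot;BREC_time&amp;quot;&lt;br /&gt;
     , &amp;quot;BREC_time&amp;quot;::TIME AS time_only&lt;br /&gt;
  FROM raw.&amp;quot;BRECORD_NOTES&amp;quot;&lt;br /&gt;
  WHERE &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;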
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 7a48fb2c4aad3b5b.&lt;br /&gt;
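&lt;br /&gt;
The substitution can be sketched as follows. This is an illustration only, not the code in the commit above: it assumes the unknown person has the code &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; and already exists in &amp;lt;code&amp;gt;clean.people&amp;lt;/code&amp;gt;, and that the change is applied directly to the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: the &amp;#039;UNK&amp;#039; person must already be on the people table.&lt;br /&gt;
UPDATE clean.community_membership_update_log&lt;br /&gt;
  SET made_by = &amp;#039;UNK&amp;#039;&lt;br /&gt;
  WHERE made_by IS NULL;&amp;lt;/pre&amp;gt;&lt;br /&gt;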
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last departure&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
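&lt;br /&gt;
A minimal sketch of the cleanup described in problems #31 and #32, combined into one expression.  The sample values are made up for illustration and are not real animids, and the actual conversion program may do this differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample values standing in for easy.follow.fol_b_animid.&lt;br /&gt;
select raw_animid&lt;br /&gt;
     , upper(rtrim(raw_animid)) as cleaned_animid&lt;br /&gt;
  from (values (&amp;#039;XX &amp;#039;), (&amp;#039;xx&amp;#039;), (&amp;#039;XX&amp;#039;)) as sample (raw_animid)&lt;br /&gt;
  order by raw_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
All three sample values clean up to the same animid, which can then be matched against BIOGRAPHY.&lt;br /&gt;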
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
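&lt;br /&gt;
A minimal sketch of the substitution, assuming the person&amp;#039;s code is stored as the text &amp;lt;code&amp;gt;NONE&amp;lt;/code&amp;gt; and that a missing observer is detected the same way as in the problem #37 query (NULL, or empty after trimming).  The sample observer codes are hypothetical.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample values standing in for one of the&lt;br /&gt;
-- clean.follow observer columns.&lt;br /&gt;
select observer&lt;br /&gt;
     , coalesce(nullif(btrim(observer), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as converted_observer&lt;br /&gt;
  from (values (&amp;#039;AB&amp;#039;), (&amp;#039;   &amp;#039;), (&amp;#039;&amp;#039;), (null)) as sample (observer);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only the first sample value keeps its own code; the blank, empty, and NULL values all become &amp;lt;code&amp;gt;NONE&amp;lt;/code&amp;gt;.&lt;br /&gt;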
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so the resolution of this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
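&lt;br /&gt;
A minimal sketch of the substitution, assuming fa_type_of_cycle is a text column and that no existing row already uses &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; as a cycle code.  It summarizes the cycle codes as they would look after the substitution, so the &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; count can be compared with the number of NULL rows reported above.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only; the conversion program may do this differently.&lt;br /&gt;
select coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as converted_cycle&lt;br /&gt;
     , count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  group by coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;)&lt;br /&gt;
  order by converted_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;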
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
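&lt;br /&gt;
A minimal sketch of such an update, assuming it is applied directly to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and uses the same join as the query above (the actual conversion code may differ):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: every female arrival with a cycle code of n/a is recoded to MISS.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;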
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
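&lt;br /&gt;
The case corrections above could be applied with statements like the following. This is a sketch that assumes the fix is made directly in &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;; the actual conversion code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: only the misspellings listed above need normalizing.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;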
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and checking only the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for two rows to be duplicates?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
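&lt;br /&gt;
The check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; is not written out above. A sketch of it, assuming it follows the same pattern as the previous query with the end time dropped from the grouping and the match:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (assumed form of the query)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;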
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
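&lt;br /&gt;
A minimal sketch of the cleanup, assuming it is done with a single update against &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the actual conversion code may differ):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Only rows whose focal id actually changes are rewritten.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;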
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between column names and primary key designations.&lt;br /&gt;
An earlier version of this query spelled the column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; and failed with:&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything concerning the affected rows, but it is adequate.&lt;br /&gt;
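&lt;br /&gt;
A minimal sketch of the change, assuming it is applied directly to &amp;lt;code&amp;gt;clean.biography&amp;lt;/code&amp;gt; (the actual conversion code may differ):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Every 0 is replaced, with no check of the affected rows.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;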
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN TO CHANGE THE SINGLE ENTRY TO EVL&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=545</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=545"/>
		<updated>2026-03-25T22:01:25Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* Solution */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
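&lt;br /&gt;
The fix above was made by hand in MS Access. As a rough illustration only, the same transformation can be sketched in SQL. The sketch assumes a hypothetical &amp;lt;code&amp;gt;biography_demo&amp;lt;/code&amp;gt; table with invented sample values, and uses an in-memory SQLite database through Python only so that it is self-contained; it is not the conversion code.&lt;br /&gt;
&lt;br /&gt;
```python
import sqlite3

# Hypothetical stand-in for raw."BIOGRAPHY"; table name and values are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE biography_demo (b_animid TEXT, b_animid_num TEXT)")
conn.executemany(
    "INSERT INTO biography_demo VALUES (?, ?)",
    [("AAA", "CH0012"), ("BBB", "CH0345"), ("CCC", None)],
)

# Drop the two-character "CH" prefix and cast the remainder to an integer.
# NULL inputs stay NULL because substr() and CAST both propagate NULL.
rows = conn.execute(
    """
    SELECT b_animid
         , CAST(substr(b_animid_num, 3) AS INTEGER) AS animid_num
      FROM biography_demo
      ORDER BY b_animid
    """
).fetchall()
print(rows)
```
&lt;br /&gt;
The &amp;lt;code&amp;gt;substr&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;CAST&amp;lt;/code&amp;gt; expressions are also available in PostgreSQL, but the sketch has only been reasoned through against SQLite.&lt;br /&gt;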
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; the communities they name do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
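&lt;br /&gt;
A minimal sketch of the substitution described above, assuming a hypothetical &amp;lt;code&amp;gt;update_log_demo&amp;lt;/code&amp;gt; table with invented rows. It uses &amp;lt;code&amp;gt;COALESCE&amp;lt;/code&amp;gt; in an in-memory SQLite database through Python so that it is self-contained; the commit named above may well implement this differently.&lt;br /&gt;
&lt;br /&gt;
```python
import sqlite3

# Hypothetical stand-in for the update log table; names and values are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE update_log_demo (chimp_id TEXT, made_by TEXT)")
conn.executemany(
    "INSERT INTO update_log_demo VALUES (?, ?)",
    [("AAA", "P1"), ("BBB", None), ("CCC", None)],
)

# Substitute the placeholder person code wherever made_by is NULL.
conn.execute("UPDATE update_log_demo SET made_by = COALESCE(made_by, 'UNK')")

# After the update no NULLs remain, and the former NULLs carry the UNK code.
remaining_nulls = conn.execute(
    "SELECT count(*) FROM update_log_demo WHERE made_by IS NULL"
).fetchone()[0]
unk_rows = conn.execute(
    "SELECT count(*) FROM update_log_demo WHERE made_by = 'UNK'"
).fetchone()[0]
print(remaining_nulls, unk_rows)
```
&lt;br /&gt;
Once every row has a person code, the column can be declared NOT NULL, which matches the goal of preventing new NULL MadeBy values.&lt;br /&gt;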
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These comprise &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These comprise 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that were done before their focal was under study, that is, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
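&lt;br /&gt;
A simpler sketch, assuming the goal is only to list every distinct observer value that contains a &amp;quot;/&amp;quot; character, whether or not a case variant of that value exists.  It is an illustration and is not claimed to reproduce the counts above; values such as &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt; also match.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: every distinct observer value containing a slash&lt;br /&gt;
WITH people AS (&lt;br /&gt;
  -- UNION removes duplicates across the four observer columns&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
)&lt;br /&gt;
SELECT people.person&lt;br /&gt;
  FROM people&lt;br /&gt;
  -- NULL values drop out here, because STRPOS(NULL, ...) is NULL&lt;br /&gt;
  WHERE STRPOS(people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
  ORDER BY LOWER(people.person), people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;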
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolving this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
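&lt;br /&gt;
One way to see which codes this affects, sketched under the assumption that a &amp;quot;mixed case&amp;quot; code is one that is neither all upper-case nor all lower-case.  The output column names &amp;lt;code&amp;gt;folded&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;mixed_case_code&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;variants&amp;lt;/code&amp;gt; are made up for this illustration.  Where several mixed-case variants exist, &amp;lt;code&amp;gt;min()&amp;lt;/code&amp;gt; picks one arbitrarily, and a NULL result means none exists, so those names still need a decision by hand.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: for each name with case variants, show a candidate mixed-case code&lt;br /&gt;
WITH people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
)&lt;br /&gt;
SELECT LOWER(people.person) AS folded&lt;br /&gt;
     , MIN(people.person)&lt;br /&gt;
         FILTER (WHERE people.person &amp;lt;&amp;gt; UPPER(people.person)&lt;br /&gt;
                       AND people.person &amp;lt;&amp;gt; LOWER(people.person))&lt;br /&gt;
         AS mixed_case_code&lt;br /&gt;
     , STRING_AGG(people.person, &amp;#039;, &amp;#039; ORDER BY people.person) AS variants&lt;br /&gt;
  FROM people&lt;br /&gt;
  -- Also excludes NULL, because NULL &amp;lt;&amp;gt; &amp;#039;&amp;#039; is not true&lt;br /&gt;
  WHERE people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  GROUP BY LOWER(people.person)&lt;br /&gt;
  HAVING COUNT(*) &amp;gt; 1&lt;br /&gt;
  ORDER BY LOWER(people.person);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;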
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with either the community entry date or the biography.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
IAN FIXED IN ACCESS 3/25/2026&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;, or&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
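&lt;br /&gt;
The two case corrections above can be previewed with a query along these lines.  This is a sketch only: the output column name &amp;lt;code&amp;gt;cleaned_data_source&amp;lt;/code&amp;gt; is made up for the illustration, and the actual conversion code may do this differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: show each raw code, the code it would be cleaned to, and a row count&lt;br /&gt;
select follow_arrival.fa_data_source&lt;br /&gt;
     , case&lt;br /&gt;
         when follow_arrival.fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;)&lt;br /&gt;
           then &amp;#039;Brec&amp;#039;&lt;br /&gt;
         when follow_arrival.fa_data_source = &amp;#039;TikI&amp;#039;&lt;br /&gt;
           then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
         -- All other codes, including the newly added ones, pass through&lt;br /&gt;
         else follow_arrival.fa_data_source&lt;br /&gt;
       end as cleaned_data_source&lt;br /&gt;
     , count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;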
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and checking just the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- No duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
     , fa_fol_b_focal_animid&lt;br /&gt;
     , fa_b_arr_animid&lt;br /&gt;
     , fa_seq_num&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  group by fa_fol_date&lt;br /&gt;
         , fa_fol_b_focal_animid&lt;br /&gt;
         , fa_b_arr_animid&lt;br /&gt;
         , fa_seq_num&lt;br /&gt;
  having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
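&lt;br /&gt;
The 430-row figure comes from leaving &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; out of the grouping, but that query is not recorded here. A minimal sketch of it, assuming it simply mirrors the start-and-end-time query with &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; removed:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (assumed variant; not the recorded query)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;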
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed FN community start date in MS Access - Oct 2025&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
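&lt;br /&gt;
A minimal sketch of that cleanup, assuming the fix is applied directly to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the statement actually used in the conversion is not recorded here):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal id.&lt;br /&gt;
begin;&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
-- Re-run the check; it should now return no rows.&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
commit;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If a uniqueness constraint covers this column and a trimmed id collides with an existing row, the update fails and the transaction can be rolled back.&lt;br /&gt;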
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between column names and primary key designations.&lt;br /&gt;
The query previously spelled the column &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimId&amp;lt;/code&amp;gt; and failed with:&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be either greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, in that it does not validate anything about the affected rows, but it is adequate.&lt;br /&gt;
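&lt;br /&gt;
A minimal sketch of that change, assuming it is applied directly to &amp;lt;code&amp;gt;clean.biography&amp;lt;/code&amp;gt; (the statement actually used in the conversion is not recorded here):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace the placeholder 0 with NULL.&lt;br /&gt;
begin;&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
-- Re-run the check; it should now return no rows.&lt;br /&gt;
select * from clean.biography where b_animid_num = 0;&lt;br /&gt;
commit;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;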
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (It is not in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN TO CHANGE THE SINGLE ENTRY TO EVL&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=544</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=544"/>
		<updated>2026-02-16T17:01:51Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#63) There are follow_arrival rows with NULL fa_data_source values */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
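&lt;br /&gt;
The query above pairs rows only when the first start date is strictly earlier than the second, so two rows for one individual that share a start date are not reported.  A minimal sketch of a variant that also reports those, assuming PostgreSQL window functions.  The &amp;lt;code&amp;gt;prev_end_date&amp;lt;/code&amp;gt; name is made up for illustration, and each row is compared only with the row immediately before it:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: compare each membership row with the previous row&lt;br /&gt;
-- for the same individual, ordered by start date.&lt;br /&gt;
-- Rows whose previous end date is NULL are not reported.&lt;br /&gt;
select *&lt;br /&gt;
  from (select cm_b_animid, cm_cl_community_id&lt;br /&gt;
             , cm_start_date, cm_end_date&lt;br /&gt;
             , lag(cm_end_date)&lt;br /&gt;
                 over (partition by cm_b_animid&lt;br /&gt;
                       order by cm_start_date, cm_end_date)&lt;br /&gt;
               as prev_end_date&lt;br /&gt;
          from clean.community_membership) as ordered&lt;br /&gt;
  where prev_end_date &amp;gt;= cm_start_date&lt;br /&gt;
  order by cm_b_animid, cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;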
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
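&lt;br /&gt;
As a sketch of the substitution (an illustration, not the conversion&amp;#039;s actual code), the unknown person can be supplied with &amp;lt;code&amp;gt;coalesce&amp;lt;/code&amp;gt;, using the column names from the query above:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace a NULL made_by with the UNK person.&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
     , coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;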
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the data in FOLLOW or the data in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the data in FOLLOW or the data in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore. Default to data from Follow_Arrival&lt;br /&gt;
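&lt;br /&gt;
A sketch of what defaulting to follow_arrival could look like, reusing the query above.  It assumes, as that query does, that fa_type_of_nesting values 1 and 3 mean the focal started in the nest.  The &amp;lt;code&amp;gt;begin_in_nest&amp;lt;/code&amp;gt; name is illustrative, and this is not the conversion&amp;#039;s actual code:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: derive a begin-in-nest flag from the focal&amp;#039;s first arrival.&lt;br /&gt;
-- A NULL fa_type_of_nesting yields 0 here.&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.fa_fol_date, spans.fa_fol_b_focal_animid&lt;br /&gt;
     , CASE WHEN first_arrivals.fa_type_of_nesting IN (1, 3)&lt;br /&gt;
            THEN 1 ELSE 0 END AS begin_in_nest&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN easy.follow_arrival AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = spans.fa_fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
          -- Restrict to the focal&amp;#039;s own arrival rows.&lt;br /&gt;
          AND first_arrivals.fa_b_arr_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;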
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
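&lt;br /&gt;
Together with the trailing-space removal of problem #31, the cleanup amounts to something like the following sketch, which lists each raw value beside its cleaned form.  The output column names are illustrative:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: animids that change when trimmed and upper-cased.&lt;br /&gt;
select distinct fol_b_animid as raw_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by cleaned_animid, raw_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;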
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the &amp;quot;NONE&amp;quot; (no observer) person for both observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
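&lt;br /&gt;
Taken together with problems #36 and #37, the observer conversion can be sketched roughly as below.  This is an illustration, not the conversion&amp;#039;s actual code: the &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; period codes and the output column names are assumptions.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- AM period: a row only when at least one AM observer is recorded.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , &amp;#039;AM&amp;#039; as period   -- hypothetical period code&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
union all&lt;br /&gt;
-- PM period: a row only when at least one PM observer is recorded.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , &amp;#039;PM&amp;#039;             -- hypothetical period code&lt;br /&gt;
     , coalesce(nullif(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;)&lt;br /&gt;
     , coalesce(nullif(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
union all&lt;br /&gt;
-- No observers at all (problem #37): UNK period, NONE observers.&lt;br /&gt;
select fol_date, fol_b_animid, &amp;#039;UNK&amp;#039;, &amp;#039;NONE&amp;#039;, &amp;#039;NONE&amp;#039;&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
order by fol_date, fol_b_animid, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;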
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
    WHERE STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
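&lt;br /&gt;
A sketch for finding the people to mark, assuming the observer names have been loaded into &amp;lt;code&amp;gt;clean.people.person&amp;lt;/code&amp;gt; (the table and column used by the query in problem #23).  How the inactive marking is stored is not shown here:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: people whose code contains a &amp;quot;/&amp;quot; character.&lt;br /&gt;
select person&lt;br /&gt;
  from clean.people&lt;br /&gt;
  where strpos(person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
  order by person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;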
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for this, and excluding the duplicates, is complicated to do in the conversion process, so resolving it is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
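&lt;br /&gt;
As a rough sketch of that rule, assuming &amp;quot;mixed case&amp;quot; means a spelling that is neither all upper case nor all lower case, a preferred spelling per name could be picked as below.  The sample names are hypothetical stand-ins for the observer columns, and the fallback to the smallest spelling when no mixed-case form exists is also an assumption.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample names; real values come from the follow observer columns.&lt;br /&gt;
WITH names (person) AS (&lt;br /&gt;
  VALUES (&amp;#039;ABC&amp;#039;), (&amp;#039;Abc&amp;#039;), (&amp;#039;abc&amp;#039;), (&amp;#039;XYZ&amp;#039;), (&amp;#039;xyz&amp;#039;)&lt;br /&gt;
)&lt;br /&gt;
SELECT LOWER(names.person) AS folded&lt;br /&gt;
       -- Prefer a spelling that is neither all upper nor all lower case;&lt;br /&gt;
       -- otherwise fall back to an arbitrary (smallest) spelling.&lt;br /&gt;
     , COALESCE(&lt;br /&gt;
         MIN(names.person)&lt;br /&gt;
           FILTER (WHERE names.person &amp;lt;&amp;gt; UPPER(names.person)&lt;br /&gt;
                         AND names.person &amp;lt;&amp;gt; LOWER(names.person))&lt;br /&gt;
       , MIN(names.person)) AS preferred&lt;br /&gt;
  FROM names&lt;br /&gt;
  GROUP BY LOWER(names.person)&lt;br /&gt;
  ORDER BY folded;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;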
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
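&lt;br /&gt;
A minimal sketch of the second half of this fix, assuming it is applied in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and that the &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; value has already been created:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumes MISS already exists as a CYCLE_STATES value.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  WHERE fa_type_of_cycle IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;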
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with either the community entry date or the biography.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
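&lt;br /&gt;
The fix itself is slated for MS Access.  For reference only, a PostgreSQL equivalent against the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema might look like the sketch below.  How unknown sex is encoded is an assumption here: a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;b_sex&amp;lt;/code&amp;gt; would not be matched by this condition.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only; rows whose b_sex is NULL are not matched by &amp;lt;&amp;gt;.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  FROM clean.biography&lt;br /&gt;
  WHERE biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        AND biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        AND follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;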
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
	and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
		)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
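&lt;br /&gt;
One way the recoding could be written is shown below.  It reuses the join from the query above; the exact statement used in the conversion is not shown on this page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: recode n/a to MISS for female arriving individuals.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  FROM clean.biography&lt;br /&gt;
  WHERE biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        AND biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        AND follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;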
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
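&lt;br /&gt;
The two case corrections could be applied in one statement, roughly as follows.  This is a sketch that assumes the cleanup is made to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema copy of the table.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Normalize the miscased codes; all other codes are left untouched.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = CASE fa_data_source&lt;br /&gt;
                         WHEN &amp;#039;BREC&amp;#039; THEN &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         WHEN &amp;#039;brec&amp;#039; THEN &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         WHEN &amp;#039;TikI&amp;#039; THEN &amp;#039;Tiki&amp;#039;&lt;br /&gt;
                       END&lt;br /&gt;
  WHERE fa_data_source IN (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;, &amp;#039;TikI&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;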
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
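&lt;br /&gt;
The problem statement also mentions a check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;.  No query for it is recorded here; a sketch, obtained by dropping that column from the query above, would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: checking against start time only&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;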
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed FN community start date in MS Access - Oct 2025&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
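&lt;br /&gt;
A minimal sketch of that cleanup, assuming the column is a &amp;lt;code&amp;gt;text&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;varchar&amp;lt;/code&amp;gt; type so that trailing spaces are significant in comparisons:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Touch only the rows that actually have trailing spaces.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_fol_b_focal_animid = RTRIM(fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE fa_fol_b_focal_animid &amp;lt;&amp;gt; RTRIM(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;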
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
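MS Access likely allows this condition because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
The focal column is named &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, with an upper-case &amp;lt;code&amp;gt;ID&amp;lt;/code&amp;gt;.  Spelling it &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimId&amp;lt;/code&amp;gt; in the query produces:&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;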
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything concerning the affected rows, but it is adequate.&lt;br /&gt;
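&lt;br /&gt;
A minimal sketch of the change, assuming it is applied directly to the &amp;lt;code&amp;gt;biography&amp;lt;/code&amp;gt; table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (the actual conversion code may do this differently):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: replace the disallowed 0 with NULL.&lt;br /&gt;
UPDATE clean.biography&lt;br /&gt;
  SET b_animid_num = NULL&lt;br /&gt;
  WHERE b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;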
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO CHANGE THE SINGLE ENTRY TO EVL&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
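&lt;br /&gt;
A minimal sketch of the cleanup, assuming the &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; table carries the same &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; column as the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; view queried above (the actual conversion code may do this differently):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: trim trailing spaces, touching only the affected rows.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_b_arr_animid = rtrim(fa_b_arr_animid)&lt;br /&gt;
  WHERE fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&amp;lt;/pre&amp;gt;&lt;br /&gt;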
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
KARL TO FIX&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=543</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=543"/>
		<updated>2026-02-16T17:01:23Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* Solution */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, table does not exist&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These involve &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These involve 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
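&lt;br /&gt;
As a rough sketch of the process described above, and not the actual conversion code, the rows could be produced by a query like the one below.  The &amp;lt;code&amp;gt;AM&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;PM&amp;lt;/code&amp;gt; period codes and the output column aliases are assumptions made for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only.  The &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; period codes are assumed.&lt;br /&gt;
-- One row per follow and time period; a period is skipped when both&lt;br /&gt;
-- of its observer columns are NULL or empty after trimming.&lt;br /&gt;
SELECT fol_date, fol_b_animid, &amp;#039;AM&amp;#039; AS period&lt;br /&gt;
     , NULLIF(BTRIM(fol_am_observer_1), &amp;#039;&amp;#039;) AS obs_brec&lt;br /&gt;
     , NULLIF(BTRIM(fol_am_observer_2), &amp;#039;&amp;#039;) AS obs_tiki&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
  WHERE COALESCE(BTRIM(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        OR COALESCE(BTRIM(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
UNION ALL&lt;br /&gt;
SELECT fol_date, fol_b_animid, &amp;#039;PM&amp;#039; AS period&lt;br /&gt;
     , NULLIF(BTRIM(fol_pm_observer_1), &amp;#039;&amp;#039;) AS obs_brec&lt;br /&gt;
     , NULLIF(BTRIM(fol_pm_observer_2), &amp;#039;&amp;#039;) AS obs_tiki&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
  WHERE COALESCE(BTRIM(fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        OR COALESCE(BTRIM(fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  ORDER BY fol_date, fol_b_animid, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In this sketch a NULL in obs_brec or obs_tiki marks a period that has only one recorded observer.&lt;br /&gt;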
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
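&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
No query was recorded for this problem.  The following is a minimal sketch, assuming that &amp;quot;only one observer&amp;quot; means exactly one of the two observer columns for a time period is NULL or the empty string (after space trimming).&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only.  Lists follows where, for the AM or the PM period,&lt;br /&gt;
-- one observer column is filled in and the other is not.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , fol_am_observer_1, fol_am_observer_2&lt;br /&gt;
     , fol_pm_observer_1, fol_pm_observer_2&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;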
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
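&lt;br /&gt;
A sketch of how the substitution might look, assuming the conversion reads &amp;lt;code&amp;gt;fa_type_of_cycle&amp;lt;/code&amp;gt; as text.  This is an illustration, not the actual conversion code.  It summarizes the cycle codes with the NULLs counted under &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only.  Counts arrivals per cycle code, with NULL&lt;br /&gt;
-- reported as the MISS code described in the solution.&lt;br /&gt;
select coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state&lt;br /&gt;
     , count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  group by coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;)&lt;br /&gt;
  order by cycle_state;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;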
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with either the community entry date or the biography.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
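&lt;br /&gt;
To preview the rows that this change would touch, a query along the following lines could be used.  It is a sketch that assumes the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; cycle code is stored as the text value &amp;lt;code&amp;gt;&amp;#039;0&amp;#039;&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only.  Non-female arrivals with a cycle code of 0.&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;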
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
	and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
		)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
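&lt;br /&gt;
A minimal sketch of this change, assuming it is applied as an in-place update of the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (the actual conversion scripts may do this differently):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: recode n/a to MISS for female arriving individuals.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;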
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
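&lt;br /&gt;
The case corrections above could be sketched as a single statement, assuming they are applied as an in-place update of the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  Adding the new codes is not shown, because where the list of valid codes is stored is not described here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the character case of the known variants.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source =&lt;br /&gt;
        case&lt;br /&gt;
          when fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;) then &amp;#039;Brec&amp;#039;&lt;br /&gt;
          when fa_data_source = &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        end&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;, &amp;#039;TikI&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;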
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for two rows to be duplicates?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
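&lt;br /&gt;
No query is listed for the start-time-only check described in the problem statement.  A sketch of one, following the pattern of the queries above, might look like this.  It lists each repeated key combination with the number of rows sharing it, so its row count need not match the 430 quoted above.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: checking against start time only&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
     , fa_fol_b_focal_animid&lt;br /&gt;
     , fa_b_arr_animid&lt;br /&gt;
     , fa_time_start&lt;br /&gt;
     , count(*) as row_count&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  group by fa_fol_date&lt;br /&gt;
         , fa_fol_b_focal_animid&lt;br /&gt;
         , fa_b_arr_animid&lt;br /&gt;
         , fa_time_start&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fa_fol_date&lt;br /&gt;
         , fa_fol_b_focal_animid&lt;br /&gt;
         , fa_b_arr_animid&lt;br /&gt;
         , fa_time_start;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;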
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
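&lt;br /&gt;
A minimal sketch of the cleanup, assuming it is applied as an in-place update of &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal id.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;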
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between column names and primary key designations.&lt;br /&gt;
An earlier version of the query above spelled the column &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimId&amp;lt;/code&amp;gt; (lower-case &amp;quot;d&amp;quot;) and failed with:&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force, but adequate, solution: it does not validate anything concerning the affected rows.&lt;br /&gt;
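&lt;br /&gt;
A minimal sketch of the change, assuming it is applied as an in-place update of &amp;lt;code&amp;gt;clean.biography&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace zero animal ID numbers with NULL.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;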
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN TO CHANGE THE SINGLE ENTRY TO EVL&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
===SOLUTION===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
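&lt;br /&gt;
A sketch of the second half of this change, assuming an in-place update of the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  Creating the &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; row in ARRIVAL_SOURCES is not shown, because that table&amp;#039;s columns are not described here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace NULL data sources with the none code.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  where fa_data_source is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;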
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=542</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=542"/>
		<updated>2026-02-16T16:59:59Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#61) Invalid follow date/focals pairs in follow_arrival */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which help eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
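&lt;br /&gt;
As a minimal sketch of the substitution described above, assuming the unknown person is stored under the code &amp;#039;UNK&amp;#039; and that made_by is textual (the conversion program&amp;#039;s actual code is not shown here, and the converted_made_by column name is illustrative only):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: preview the rows whose NULL made_by would become the assumed &amp;#039;UNK&amp;#039; code.&lt;br /&gt;
select log.*, coalesce(log.made_by, &amp;#039;UNK&amp;#039;) as converted_made_by&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where log.made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;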
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column &lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column &lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
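&lt;br /&gt;
A sketch of how the cleanup for problems #31 and #32 might be previewed before conversion, using the same &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; expression that the problem #33 queries use.  The output column names are illustrative only, and this is not necessarily how the conversion process itself is written:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show each animid that the cleanup would change, next to its cleaned form.&lt;br /&gt;
select fol_date&lt;br /&gt;
     , fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;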
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows dated before their focal was under study, that is, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
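&lt;br /&gt;
The row-creation rule described above can be sketched as a query.  This is an illustration only, not the conversion code: the period codes &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; and the output column names are assumptions.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; are assumed period codes,&lt;br /&gt;
-- and period, obs_brec, obs_tiki are illustrative column names.&lt;br /&gt;
-- One row per follow and time period having at least one observer.&lt;br /&gt;
select fol_b_animid&lt;br /&gt;
     , fol_date&lt;br /&gt;
     , &amp;#039;AM&amp;#039; as period&lt;br /&gt;
     , btrim(fol_am_observer_1) as obs_brec&lt;br /&gt;
     , btrim(fol_am_observer_2) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
union all&lt;br /&gt;
select fol_b_animid&lt;br /&gt;
     , fol_date&lt;br /&gt;
     , &amp;#039;PM&amp;#039; as period&lt;br /&gt;
     , btrim(fol_pm_observer_1) as obs_brec&lt;br /&gt;
     , btrim(fol_pm_observer_2) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;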
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person both observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires that a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
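&lt;br /&gt;
No query is recorded for this problem.  The following is a sketch of one way to list the affected follows, assuming that &amp;quot;only one observer&amp;quot; means exactly one of the two observer columns of a time period is filled in after space trimming.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: follows where, in the AM or the PM period,&lt;br /&gt;
-- exactly one of the two observer columns is empty.&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;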
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there is otherwise no value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  That is about half that many unique names once case is ignored.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
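&lt;br /&gt;
As an illustration of the rule, the sketch below lists each group of case variants with a candidate mixed case code.  It assumes that &amp;quot;mixed case&amp;quot; means a spelling that is neither all upper-case nor all lower-case.  When a group has several such spellings, &amp;lt;code&amp;gt;min()&amp;lt;/code&amp;gt; picks one arbitrarily.  The output column names are illustrative.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: mixed_case_code is NULL when a group has no&lt;br /&gt;
-- spelling that is neither all upper-case nor all lower-case.&lt;br /&gt;
with people as (&lt;br /&gt;
  select btrim(fol_am_observer_1) as person from easy.follow&lt;br /&gt;
  union&lt;br /&gt;
  select btrim(fol_am_observer_2) from easy.follow&lt;br /&gt;
  union&lt;br /&gt;
  select btrim(fol_pm_observer_1) from easy.follow&lt;br /&gt;
  union&lt;br /&gt;
  select btrim(fol_pm_observer_2) from easy.follow&lt;br /&gt;
)&lt;br /&gt;
select lower(person) as folded&lt;br /&gt;
     , min(person) filter (where person &amp;lt;&amp;gt; upper(person)&lt;br /&gt;
                                 and person &amp;lt;&amp;gt; lower(person))&lt;br /&gt;
         as mixed_case_code&lt;br /&gt;
     , count(*) as variants&lt;br /&gt;
  from people&lt;br /&gt;
  where person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  group by lower(person)&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by lower(person);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;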
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
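&lt;br /&gt;
A minimal sketch of the substitution, shown as a preview query and not as the conversion code itself.  The &amp;lt;code&amp;gt;cycle_state&amp;lt;/code&amp;gt; column name is illustrative.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show the MISS code next to each arrival&lt;br /&gt;
-- that has no cycle information.&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , coalesce(follow_arrival.fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;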
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with the community entry date or the biography data.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
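&lt;br /&gt;
Before the change is made, the affected rows can be previewed.  The query below is a sketch that assumes the cycle code is stored as the text &amp;#039;0&amp;#039;.  The &amp;lt;code&amp;gt;current_code&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;proposed_code&amp;lt;/code&amp;gt; column names are illustrative.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: non-female arrivals whose cycle code of 0&lt;br /&gt;
-- would become n/a.&lt;br /&gt;
select follow_arrival.fa_fol_date&lt;br /&gt;
     , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
     , follow_arrival.fa_b_arr_animid&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , follow_arrival.fa_type_of_cycle as current_code&lt;br /&gt;
     , &amp;#039;n/a&amp;#039; as proposed_code&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;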
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
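&lt;br /&gt;
The case corrections can be previewed with a query like the one below.  This is a sketch, not the cleanup code: it folds every case variant of &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;, which is an assumption broader than the specific spellings listed above.  The &amp;lt;code&amp;gt;proposed_source&amp;lt;/code&amp;gt; column name is illustrative.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: each existing code, its proposed replacement,&lt;br /&gt;
-- and the number of rows that carry it.&lt;br /&gt;
select follow_arrival.fa_data_source&lt;br /&gt;
     , case&lt;br /&gt;
         when lower(follow_arrival.fa_data_source) = &amp;#039;brec&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
         when lower(follow_arrival.fa_data_source) = &amp;#039;tiki&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
         else follow_arrival.fa_data_source&lt;br /&gt;
       end as proposed_source&lt;br /&gt;
     , count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;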
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
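&lt;br /&gt;
The start-time-only check described in the Problem section has no query recorded here.  The sketch below adapts the query above by dropping &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;.  It is an assumption that this matches the check originally run.  As in the query above, rows with a NULL &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; are not matched by the equality test.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: checking against start time alone&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;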
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
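&lt;br /&gt;
A minimal sketch of that cleanup, assuming the affected table is &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and that no key or foreign key constraint blocks the update (neither assumption is confirmed on this page):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal animid.&lt;br /&gt;
-- The where clause limits the update to the rows that actually change.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;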
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between column names and primary key designations.&lt;br /&gt;
With the column name spelled &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;, the query fails:&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything concerning the affected rows, but it is adequate.&lt;br /&gt;
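&lt;br /&gt;
As a sketch, assuming the change is applied directly to &amp;lt;code&amp;gt;clean.biography&amp;lt;/code&amp;gt; (the conversion may instead apply it elsewhere), the update could look like:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace every 0 animal ID number with NULL.&lt;br /&gt;
-- No other column of the affected rows is examined.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;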
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (The value is not in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO CHANGE THE SINGLE ENTRY TO EVL&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focal pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
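&lt;br /&gt;
A sketch of the second half of that change, assuming the new lookup value is spelled exactly &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; and has already been added to ARRIVAL_SOURCES (that table&amp;#039;s column names are not shown on this page, so the insert is omitted):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: use the &amp;#039;none&amp;#039; code wherever the data source is missing.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  where fa_data_source is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;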
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=541</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=541"/>
		<updated>2026-02-16T16:58:41Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#60) Invalid biography_update_log.made_by values */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
        SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
        TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted the row.&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted the time stamp.&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted the time stamp.&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
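&lt;br /&gt;
The conversion code is not shown on this page. As a sketch only, assuming the raw column casts cleanly to &amp;lt;code&amp;gt;TIME&amp;lt;/code&amp;gt; (as the &amp;lt;code&amp;gt;::DATE&amp;lt;/code&amp;gt; cast above suggests), the time portion of the affected rows could be extracted as follows; the &amp;lt;code&amp;gt;brec_time_only&amp;lt;/code&amp;gt; alias is hypothetical:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: keep the time of day and drop the unexpected date portion.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;::TIME as brec_time_only&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;&lt;br /&gt;
  where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;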
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community that is not their birth community before BIOGRAPHY says they entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG and EVL fixed BH, HAI, and TZB2 in MS Access. Need to talk to Karl about MG, RO, and WD. Treat them like the KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access, 2/1/2024, ICG.&lt;br /&gt;
Changed the end date of the KK_P1 membership to 8/14/2022.&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
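&lt;br /&gt;
A minimal sketch of the substitution, not the committed conversion code.  It assumes the unknown person is stored under the code &amp;#039;UNK&amp;#039;; the made_by_converted column name is hypothetical:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show each NULL made_by next to its UNK replacement.&lt;br /&gt;
select log.date_of_update, log.chimp_id&lt;br /&gt;
     , coalesce(log.made_by, &amp;#039;UNK&amp;#039;) as made_by_converted&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where log.made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;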
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
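&lt;br /&gt;
As a sketch, the cleanups of problems #31 and #32 can be previewed together with a single upper(rtrim(...)) expression.  This is an illustration, not the conversion program itself; the original_animid and cleaned_animid column names are hypothetical:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: list each raw animid that the cleanup would change.&lt;br /&gt;
select distinct fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by original_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;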
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
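&lt;br /&gt;
The row-creation rule above can be sketched as a query.  This is an illustration, not the conversion program itself; the &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; period codes and the output column names are assumptions:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: one row per follow and time period, skipping a period&lt;br /&gt;
-- when both of its observer columns are NULL or empty after trimming.&lt;br /&gt;
-- A NULL obs_brec or obs_tiki marks a period with only one observer.&lt;br /&gt;
select fol_date, fol_b_animid, &amp;#039;AM&amp;#039; as period&lt;br /&gt;
     , nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) as obs_brec&lt;br /&gt;
     , nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
union all&lt;br /&gt;
select fol_date, fol_b_animid, &amp;#039;PM&amp;#039; as period&lt;br /&gt;
     , nullif(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) as obs_brec&lt;br /&gt;
     , nullif(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;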
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
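&lt;br /&gt;
A simpler sketch, assuming the goal is only to list every distinct observer value that contains a &amp;quot;/&amp;quot; character, regardless of character case.  The obs and observer names are hypothetical, and the four observer columns are assumed to share one text type:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: unpivot the four observer columns and keep the&lt;br /&gt;
-- distinct trimmed values that contain a slash.  NULL observers&lt;br /&gt;
-- drop out because strpos() of NULL is NULL.&lt;br /&gt;
select distinct btrim(obs.observer) as person&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       cross join lateral&lt;br /&gt;
         (values (fol_am_observer_1), (fol_am_observer_2)&lt;br /&gt;
               , (fol_pm_observer_1), (fol_pm_observer_2))&lt;br /&gt;
         as obs (observer)&lt;br /&gt;
  where strpos(obs.observer, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
  order by person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;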
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with community entry date or biography&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
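&lt;br /&gt;
For reference, the change of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt; could be expressed in PostgreSQL roughly as follows.  This is only a sketch: the fix itself is to be made in Access, and the statement assumes the cycle code is stored as the text &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; and that it would be run in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: unknown sex is stored as a non-NULL b_sex value.&lt;br /&gt;
-- Rows whose b_sex is NULL do not match the &amp;lt;&amp;gt; test and are not changed.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;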
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
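&lt;br /&gt;
A minimal sketch of the change described above, assuming it is applied directly to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and that &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; is already a valid cycle code (the actual conversion script may differ):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Same selection as the bad data query, turned into an update.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;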
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
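&lt;br /&gt;
The two case normalizations above could be sketched as follows, assuming they are applied to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (the bad data query reads &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; because &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; has already been fixed).  Adding the new codes is not shown.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: only these three misspellings need normalizing.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;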
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;, and checking just the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for two rows to be duplicates?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
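&lt;br /&gt;
The start-time-only check mentioned in the problem description is not listed above.  A sketch, adapted from the previous query by dropping &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;; it has not been checked against the row count given above:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;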
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed FN community start date in MS Access - Oct 2025&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
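&lt;br /&gt;
A minimal sketch of the cleanup, assuming it is run directly against &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the actual conversion script may differ):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Only touch rows where rtrim() would change the focal id.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;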
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely due to a difference in character case between column names and primary key designations.&lt;br /&gt;
Quoted column names are case-sensitive in Postgres.  Spelling the column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; (with a lowercase final &amp;quot;d&amp;quot;) in the query fails with:&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything about the affected rows, but it is adequate.&lt;br /&gt;
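&lt;br /&gt;
A minimal sketch of the change, assuming it is applied in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Replaces every 0 with NULL; no other checks are made on the rows.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;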
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== SOLUTION ===&lt;br /&gt;
IAN TO CHANGE THE SINGLE ENTRY TO EVL&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
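&lt;br /&gt;
A minimal sketch of the cleanup for the arriving animid, assuming it is run directly against &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the actual conversion script may differ):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Only touch rows where rtrim() would change the arriving id.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_b_arr_animid = rtrim(fa_b_arr_animid)&lt;br /&gt;
  where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;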
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
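&lt;br /&gt;
A minimal sketch of the second step, assuming the &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value has already been created in ARRIVAL_SOURCES (creating it is not shown, because the columns of that table are not described here):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: run in the clean schema, after none is a valid source.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  where fa_data_source is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;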
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=540</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=540"/>
		<updated>2026-02-16T16:57:26Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#59) Zero BIOGRAPHY.b_animid_num values */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
        SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
        TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 7a48fb2c4aad3b5b.&lt;br /&gt;
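&lt;br /&gt;
The substitution described in this solution could be sketched as follows, assuming the unknown person is stored with the code &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt;.  This is an illustration only, not the code from the commit.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical illustration: report made_by with NULL replaced by &amp;#039;UNK&amp;#039;.&lt;br /&gt;
SELECT date_of_update, chimp_id&lt;br /&gt;
     , COALESCE(made_by, &amp;#039;UNK&amp;#039;) AS made_by&lt;br /&gt;
  FROM clean.community_membership_update_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;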
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
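&lt;br /&gt;
A minimal sketch of the cleanup from problems #31 and #32 taken together, assuming it is applied as an update within the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  The actual conversion code may do this differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces and force upper-case in one pass.&lt;br /&gt;
-- Only rows that would actually change are touched.&lt;br /&gt;
update clean.follow&lt;br /&gt;
  set fol_b_animid = upper(rtrim(fol_b_animid))&lt;br /&gt;
  where fol_b_animid &amp;lt;&amp;gt; upper(rtrim(fol_b_animid));&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;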
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that were done before their focal was under study, that is, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into FOLLOW_OBSERVERS.OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
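&lt;br /&gt;
The row-creation rule above can be sketched as a query.  This is an illustration, not the conversion code: the period codes &amp;lt;code&amp;gt;AM&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;PM&amp;lt;/code&amp;gt; and the output column names are assumptions.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only.  One row per follow and time period, skipping a period&lt;br /&gt;
-- when both of its observer columns are NULL or empty after trimming.&lt;br /&gt;
-- The &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; period codes are hypothetical.&lt;br /&gt;
select fol_date, fol_b_animid, &amp;#039;AM&amp;#039; as period&lt;br /&gt;
     , nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) as obs_brec&lt;br /&gt;
     , nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
union all&lt;br /&gt;
select fol_date, fol_b_animid, &amp;#039;PM&amp;#039; as period&lt;br /&gt;
     , nullif(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;)&lt;br /&gt;
     , nullif(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;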
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
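&lt;br /&gt;
A minimal sketch of the substitution, shown for the morning columns only.  It assumes that &amp;lt;code&amp;gt;NONE&amp;lt;/code&amp;gt; is the literal code of the placeholder person; the output column names are for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only.  When one of the two observers is missing, use NONE.&lt;br /&gt;
-- Rows where both are missing are excluded; see problem #37.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;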
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names, and excluding the duplicates, is complicated in the conversion process.  So resolution of this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
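&lt;br /&gt;
A minimal sketch of the substitution, assuming it is done when the rows are read for conversion.  The output column name &amp;lt;code&amp;gt;cycle_state&amp;lt;/code&amp;gt; is for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only.  Arrivals with no cycle information get the MISS code.&lt;br /&gt;
select fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid&lt;br /&gt;
     , coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  order by fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;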
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with either the community entry date or the biography.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
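&lt;br /&gt;
The age boundary can be checked with literal dates.  The following is a self-contained sketch that uses the same comparison as this problem&amp;#039;s query; the two birthdate and follow date pairs are made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only, with made-up dates.  The same test as the problem query:&lt;br /&gt;
-- the birthdate is on or before the follow date minus 15 years.&lt;br /&gt;
select d.birthdate&lt;br /&gt;
     , d.fol_date&lt;br /&gt;
     , d.birthdate&lt;br /&gt;
         &amp;lt;= (d.fol_date&lt;br /&gt;
             - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
             - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
            ) as too_old&lt;br /&gt;
  from (values (&amp;#039;1995-06-15&amp;#039;::date, &amp;#039;2010-06-14&amp;#039;::date)&lt;br /&gt;
             , (&amp;#039;1995-06-15&amp;#039;::date, &amp;#039;2010-06-15&amp;#039;::date))&lt;br /&gt;
       as d (birthdate, fol_date);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The first row, one day short of the 15th birthday, is not flagged.  The second row, on the 15th birthday, is flagged.&lt;br /&gt;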
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
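&lt;br /&gt;
A minimal sketch of the change described in this solution, assuming it is written as a single update against the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  The actual cleanup code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only.  Females with a cycle code of n/a get the MISS code.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;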
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source  count&lt;br /&gt;
  Tiki_GM         202&lt;br /&gt;
  Tiki_PM         165&lt;br /&gt;
  Tiki_SS         22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
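&lt;br /&gt;
A minimal sketch of the two case corrections above, assuming they are applied as in-place updates to the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. This is an illustration only; the actual conversion code is not shown on this page and may do this differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: the fix is an in-place update of clean.follow_arrival&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;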
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
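&lt;br /&gt;
The Problem section also describes a check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;. A sketch of that check, adapted from the query above rather than taken from the original conversion scripts:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;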
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows in which the arriving individual is a female less than 8 years of age but has a sexual cycle code indicating sexual swelling.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
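&lt;br /&gt;
A minimal sketch of this cleanup, assuming it is applied as an in-place update to the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. This is hypothetical; the conversion code itself is not shown on this page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: the fix is an in-place update of clean.follow_arrival&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;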
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely due to a difference in character case between column names and primary key designations.&lt;br /&gt;
The column name ends in &amp;lt;code&amp;gt;AnimID&amp;lt;/code&amp;gt;, not &amp;lt;code&amp;gt;AnimId&amp;lt;/code&amp;gt;. With the latter spelling the query fails:&lt;br /&gt;
ERROR MESSAGE: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be either greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything concerning the rows affected, but it is adequate.&lt;br /&gt;
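&lt;br /&gt;
A minimal sketch of the change, assuming it is applied as an in-place update to the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (the conversion code itself is not shown on this page):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: the fix is an in-place update of clean.biography&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;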
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
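&lt;br /&gt;
To see how many arrival rows hang off each missing follow, the rows can be summarized per follow date and focal. This is a sketch built from the query above, in the same style as the summary query in (#53):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- A summary of the unmatched follow date/focal pairs&lt;br /&gt;
SELECT follow_arrival.fa_fol_date&lt;br /&gt;
     , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
     , count(*)&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                       = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  GROUP BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&amp;lt;/pre&amp;gt;&lt;br /&gt;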
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=539</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=539"/>
		<updated>2026-02-16T16:56:45Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#58) OTHER_SPECIES duplicate keys */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted the row.&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted the time stamp.&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted the time stamp.&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column contains non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
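&lt;br /&gt;
A minimal sketch of this substitution, assuming it can be expressed with &amp;lt;code&amp;gt;COALESCE&amp;lt;/code&amp;gt;.  This is not the conversion program&amp;#039;s actual code, and the sample rows and person code &amp;#039;XYZ&amp;#039; are hypothetical.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample rows standing in for the update log.&lt;br /&gt;
select log.chimp_id&lt;br /&gt;
     , coalesce(log.made_by, &amp;#039;UNK&amp;#039;) as made_by  -- NULL becomes the unknown person&lt;br /&gt;
  from (values (&amp;#039;AAA&amp;#039;, &amp;#039;XYZ&amp;#039;)&lt;br /&gt;
             , (&amp;#039;BBB&amp;#039;, NULL)) as log (chimp_id, made_by);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;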
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error; ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error; ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
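&lt;br /&gt;
A sketch of the combined cleanup from problems #31 and #32, using the same &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; expression that appears in the queries for problem #33.  The sample values are hypothetical, and the conversion program may do this differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Each hypothetical animid below cleans up to the same value.&lt;br /&gt;
select sample.fol_b_animid as original&lt;br /&gt;
     , upper(rtrim(sample.fol_b_animid)) as cleaned&lt;br /&gt;
  from (values (&amp;#039;ABC  &amp;#039;), (&amp;#039;abc&amp;#039;), (&amp;#039;ABC&amp;#039;)) as sample (fol_b_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;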
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
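&lt;br /&gt;
A sketch of the row-creation rule described above, for the morning period only.  The &amp;#039;AM&amp;#039; period code, the output column names, and the sample rows are assumptions made for illustration; this is not the conversion program&amp;#039;s code.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample rows standing in for clean.follow.&lt;br /&gt;
with follow (fol_date, fol_b_animid, fol_am_observer_1, fol_am_observer_2) as (&lt;br /&gt;
  values (&amp;#039;2000-01-01&amp;#039;::date, &amp;#039;AAA&amp;#039;, &amp;#039;P1 &amp;#039;, &amp;#039;P2&amp;#039;)  -- both observers present&lt;br /&gt;
       , (&amp;#039;2000-01-02&amp;#039;::date, &amp;#039;AAA&amp;#039;, &amp;#039;  &amp;#039;, NULL))     -- neither present: no row&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , &amp;#039;AM&amp;#039; as period&lt;br /&gt;
     , btrim(fol_am_observer_1) as obs_brec&lt;br /&gt;
     , btrim(fol_am_observer_2) as obs_tiki&lt;br /&gt;
  from follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;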
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
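&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
No query was recorded for this problem.  The following is a minimal sketch, not the query used in the conversion.  It assumes that &amp;quot;no observer&amp;quot; means NULL or the empty string after trimming, as in problem #37, and lists follows where exactly one of the two observers of a time period is recorded.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only.  A period is a problem when one observer column is blank&lt;br /&gt;
-- and the other is not, so the two &amp;quot;is blank&amp;quot; tests disagree.&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
             &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;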
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
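&lt;br /&gt;
The query above joins each name to a case variant of itself, so it reports only slash-containing names that also differ from another name by character case.  A minimal sketch, not part of the conversion, that lists every distinct observer value containing a &amp;quot;/&amp;quot; regardless of case variants:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: unpivot the four observer columns, then keep values with a slash.&lt;br /&gt;
-- NULL observers drop out because STRPOS(NULL, ...) is NULL.&lt;br /&gt;
SELECT DISTINCT BTRIM(obs.person) AS person&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
       CROSS JOIN LATERAL (VALUES (follow.fol_am_observer_1)&lt;br /&gt;
                                , (follow.fol_am_observer_2)&lt;br /&gt;
                                , (follow.fol_pm_observer_1)&lt;br /&gt;
                                , (follow.fol_pm_observer_2)&lt;br /&gt;
                          ) AS obs (person)&lt;br /&gt;
  WHERE STRPOS(obs.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
  ORDER BY person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;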
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  That is about half as many in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates complicates the conversion process, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with either the community entry date or the biography data.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
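&lt;br /&gt;
A minimal sketch of such a change, assuming it is made with a single &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; against the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (the statement actually used by the conversion is not shown on this page):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: same join and conditions as the query under Bad data.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;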
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
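&lt;br /&gt;
A minimal sketch of the two case corrections, assuming they are applied with an &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (the solution does not say how they were made).  Adding the new codes is not sketched, because this page does not describe where the valid codes are stored.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the miscapitalized codes named in the solution.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = case&lt;br /&gt;
                         when fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;) then &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         when fa_data_source = &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
                       end&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;, &amp;#039;TikI&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;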
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for two rows to be duplicates?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
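&lt;br /&gt;
The start-time-only check described in the problem has no query above.  A sketch of it, assuming the same grouping approach with &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; left out.  The &amp;lt;code&amp;gt;n_rows&amp;lt;/code&amp;gt; column name is only illustrative.  It returns one row per duplicated combination, with the number of follow_arrival rows sharing it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: checking against start time, ignoring end time&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
     , fa_fol_b_focal_animid&lt;br /&gt;
     , fa_b_arr_animid&lt;br /&gt;
     , fa_time_start&lt;br /&gt;
     , count(*) as n_rows&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  group by fa_fol_date&lt;br /&gt;
         , fa_fol_b_focal_animid&lt;br /&gt;
         , fa_b_arr_animid&lt;br /&gt;
         , fa_time_start&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fa_fol_date&lt;br /&gt;
         , fa_fol_b_focal_animid&lt;br /&gt;
         , fa_b_arr_animid&lt;br /&gt;
         , fa_time_start;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;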
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access (October 2025).&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
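A minimal sketch of the cleanup, assuming it is done with an &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; on &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the statement actually used is not shown on this page):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: touch just the rows that the Bad data query reports.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;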
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimID&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimID&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
IAN TO CHECK IN ACCESS&lt;br /&gt;
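MS Access likely allows this condition because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
The column is spelled &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt; (capital &amp;lt;code&amp;gt;ID&amp;lt;/code&amp;gt;). Running the query with the column spelled &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimId&amp;lt;/code&amp;gt; produced this error message: Line 4: ERROR when executing SQL: column &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; does not exist&lt;br /&gt;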
Hint: Perhaps you meant to reference the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot; or the column &amp;quot;OTHER_SPECIES.OS_FOL_B_focal_AnimID&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything about the affected rows, but it is adequate.&lt;br /&gt;
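&lt;br /&gt;
A minimal sketch of the change, assuming it is applied directly to &amp;lt;code&amp;gt;clean.biography&amp;lt;/code&amp;gt; (the actual conversion code may do this elsewhere):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: replace every 0 animal ID number with NULL.&lt;br /&gt;
UPDATE clean.biography&lt;br /&gt;
  SET b_animid_num = NULL&lt;br /&gt;
  WHERE b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;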
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focal pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
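&lt;br /&gt;
A minimal sketch of the second half of that change, assuming a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; row has already been added to ARRIVAL_SOURCES (whose column names are not shown here) and that the code is stored as the literal text &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: the literal &amp;#039;none&amp;#039; and the target schema are assumptions.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  WHERE fa_data_source IS NULL;&amp;lt;/pre&amp;gt;&lt;br /&gt;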
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=538</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=538"/>
		<updated>2026-02-16T16:55:10Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#57) GROOM_BOUT duplicate keys */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
        SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
        TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale.&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
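&lt;br /&gt;
A sketch of how this cleanup and the trailing-space cleanup of problem #31 might be previewed together.  It assumes the conversion normalizes the animid with &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt;, as the queries for problem #33 do; the column alias &amp;lt;code&amp;gt;cleaned_animid&amp;lt;/code&amp;gt; is made up for the example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- List each raw animid that cleanup would change, with its cleaned form.&lt;br /&gt;
select distinct fol_b_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;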
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
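&lt;br /&gt;
No query is recorded for this problem.  A possible sketch, reusing the blank-observer test from problem #37 and assuming a period counts as having &amp;quot;only one observer&amp;quot; when exactly one of its two observer columns is blank:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Follows where the AM or PM period has exactly one blank observer.&lt;br /&gt;
-- Each parenthesized test is a boolean; &amp;lt;&amp;gt; is true when they differ.&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
             &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;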
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these variants and excluding the duplicates in the conversion process is complicated, so resolving this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with the community entry date or the biography.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
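&lt;br /&gt;
As a worked example of the 15-year endpoint described above, here is a sketch using made-up dates (not taken from the data). A female born exactly 15 years before the follow date satisfies the test and is flagged; one born a day later is not.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical dates, for illustration only.&lt;br /&gt;
select &amp;#039;2005-06-15&amp;#039;::DATE&lt;br /&gt;
         &amp;lt;= (&amp;#039;2020-06-15&amp;#039;::DATE&lt;br /&gt;
             - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
             - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
            ) as born_15_years_before&lt;br /&gt;
     , &amp;#039;2005-06-16&amp;#039;::DATE&lt;br /&gt;
         &amp;lt;= (&amp;#039;2020-06-15&amp;#039;::DATE&lt;br /&gt;
             - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
             - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
            ) as born_a_day_later;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;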
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
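&lt;br /&gt;
A minimal sketch of that change, assuming it is applied with a single PostgreSQL &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; that reuses the join condition from the query above.  The actual conversion script may do this differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: recode n/a to MISS for female arriving individuals.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;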
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
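&lt;br /&gt;
A sketch of the two case corrections above, assuming they are made with plain &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; statements against the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  The actual conversion may do this differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the character case of the known variants.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;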
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
Checking for duplicates on the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt; finds none; the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and checking just the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
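&lt;br /&gt;
No query is given above for the start-time-only check.  A sketch, assuming it is the same query with &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; left off; this has not been checked against the 430-row count.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (sketch; fa_time_end left off)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;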
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
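&lt;br /&gt;
A sketch of the removal, assuming the column is a variable-length text type and that a plain &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; is acceptable.  The actual conversion may do this differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal animid.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;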
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG TO LOOK AT SOURCE OF PROBLEM. MAYBE JUST DELETE DUPLICATES?&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The MS Access database likely allows this condition because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be either greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (seemingly those for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force, but adequate, solution: it does not validate anything about the affected rows.&lt;br /&gt;
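&lt;br /&gt;
A sketch of the change, assuming it is applied directly to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema; the actual conversion may do this differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace the 0 placeholder with NULL.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;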
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person; it does not appear in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focal pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=537</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=537"/>
		<updated>2026-02-16T16:52:47Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#55) There are community_membership rows that place an individual in a community before birth */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, table does not exist&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS NOT NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 by ICG.&lt;br /&gt;
Changed the end date of the KK_P1 membership to 8/14/2022.&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
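&lt;br /&gt;
A minimal sketch of the substitution, for illustration only.  This is not the conversion&amp;#039;s actual code, and it assumes the unknown person&amp;#039;s code is the literal &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt;; the &amp;lt;code&amp;gt;made_by_or_unk&amp;lt;/code&amp;gt; alias is hypothetical.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: report UNK wherever made_by is NULL.&lt;br /&gt;
select log.*&lt;br /&gt;
     , coalesce(log.made_by, &amp;#039;UNK&amp;#039;) as made_by_or_unk&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where log.made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;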
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the end of the last arrival (the last departure).  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These involve &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
This has been cleaned up in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
This has been cleaned up in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
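&lt;br /&gt;
The cleanups for problems #31 and #32 can be sketched together.  This is a minimal illustration, not the actual conversion code: it uses the Python sqlite3 module with an in-memory stand-in for the follow table (SQLite rather than Postgres), and the sample animids are made up.&lt;br /&gt;
&lt;br /&gt;
```python
import sqlite3

# In-memory stand-in for easy.follow; only the animid column matters here.
conn = sqlite3.connect(":memory:")
conn.execute("create table follow (fol_b_animid text)")
conn.executemany(
    "insert into follow (fol_b_animid) values (?)",
    [("AB ",), ("cd",), ("EF",)],  # made-up sample animids
)

# Strip trailing spaces and force upper-case, as in problems #31 and #32.
# Only rows that actually change are touched.
conn.execute(
    "update follow set fol_b_animid = upper(rtrim(fol_b_animid))"
    " where fol_b_animid != upper(rtrim(fol_b_animid))"
)

cleaned = [row[0] for row in conn.execute(
    "select fol_b_animid from follow order by fol_b_animid")]
print(cleaned)
```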
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These involve 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that were done before their focal was under study, that is, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 columns go into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 columns go into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the &amp;quot;NONE&amp;quot; (no observer) person for both observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires that a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
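&lt;br /&gt;
The observer rules in problems #36 through #38 can be sketched as follows.  This is an illustration in Python, not the conversion code: the function name and the AM and PM period codes are assumptions, while NONE and UNK come from the solutions above.&lt;br /&gt;
&lt;br /&gt;
```python
def observer_rows(am_1, am_2, pm_1, pm_2):
    """Return (period, obs_brec, obs_tiki) tuples for one follow."""

    def clean(value):
        # Treat NULL (None) and all-space strings alike, trimming spaces only.
        return (value or "").strip(" ")

    rows = []
    # "AM" and "PM" are assumed period codes, not taken from the source.
    for period, brec, tiki in (
        ("AM", clean(am_1), clean(am_2)),
        ("PM", clean(pm_1), clean(pm_2)),
    ):
        if not brec and not tiki:
            continue  # problem #36: no row for a period without observers
        # Problem #38: the missing observer becomes the NONE person.
        rows.append((period, brec or "NONE", tiki or "NONE"))

    if not rows:
        # Problem #37: no observers at all, so NONE observers in the UNK period.
        rows.append(("UNK", "NONE", "NONE"))
    return rows


print(observer_rows("HM", None, "  ", ""))
```

A follow with observers recorded only in the morning gets a single AM row, and a follow with all four columns blank gets the single UNK row.&lt;br /&gt;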
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that many in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for this and excluding the duplicates in the conversion process is complicated, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
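&lt;br /&gt;
One way to pick the spelling for each group of names that differ only by case is sketched below in Python.  The function name and sample names are hypothetical, and the tie-break (first in sort order when there are several mixed-case spellings, or none) is an assumption that the rule above does not specify.&lt;br /&gt;
&lt;br /&gt;
```python
from collections import defaultdict


def canonical_names(names):
    """Map each name to the spelling chosen for its case-insensitive group."""
    groups = defaultdict(set)
    for name in names:
        groups[name.lower()].add(name)

    chosen = {}
    for variants in groups.values():
        ordered = sorted(variants)
        # A mixed-case spelling is neither all upper-case nor all lower-case.
        mixed = [v for v in ordered if v != v.upper() and v != v.lower()]
        # Assumption: with several or no mixed-case spellings, take the first.
        pick = mixed[0] if mixed else ordered[0]
        for variant in ordered:
            chosen[variant] = pick
    return chosen


print(canonical_names(["ABC", "Abc", "abc", "XYZ"]))
```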
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with the community entry date or the biography.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOW UP WITH KARL.&lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
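&lt;br /&gt;
The age test used in the query for this problem can be mirrored outside the database.  The following is a minimal Python sketch under stated assumptions: the function names are hypothetical, and the Feb 29 handling is intended to match how Postgres subtracts year intervals.&lt;br /&gt;
&lt;br /&gt;
```python
from datetime import date


def years_before(day, years):
    """Return the date `years` before `day`, moving Feb 29 to Feb 28 if needed."""
    try:
        return day.replace(year=day.year - years)
    except ValueError:
        # Feb 29 does not exist in the target year.
        return day.replace(year=day.year - years, day=28)


def too_old_for_u(birthdate, follow_date, limit_years=14):
    # Mirrors the query: the individual is flagged only once the whole
    # year following the limit has passed, i.e. at limit_years + 1.
    return years_before(follow_date, limit_years + 1) >= birthdate


print(too_old_for_u(date(2000, 1, 1), date(2015, 1, 1)))
```

With the 14 year limit, an individual is flagged from the 15th birthday onward, which is the endpoint behavior noted in the problem description.&lt;br /&gt;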
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source  count&lt;br /&gt;
  Tiki_GM         202&lt;br /&gt;
  Tiki_PM         165&lt;br /&gt;
  Tiki_SS         22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
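&lt;br /&gt;
The spelling fixes and the extended code list above can be sketched as a small lookup.  This is an illustration in Python, not the conversion code; the names DATA_SOURCE_FIXES, VALID_DATA_SOURCES, and clean_data_source are hypothetical.&lt;br /&gt;
&lt;br /&gt;
```python
# Spelling fixes named in the solution; other codes pass through unchanged.
DATA_SOURCE_FIXES = {"BREC": "Brec", "brec": "Brec", "TikI": "Tiki"}

# The original codes plus the codes added by the solution.
VALID_DATA_SOURCES = {
    "Tiki", "Tiki_Mom", "Tiki_ID", "Brec",
    "Tiki_GM", "Tiki_PM", "Tiki_SS",
}


def clean_data_source(code):
    """Return the corrected code, raising if it is still not a known code."""
    fixed = DATA_SOURCE_FIXES.get(code, code)
    if fixed not in VALID_DATA_SOURCES:
        raise ValueError(f"unknown fa_data_source value: {code!r}")
    return fixed


print(clean_data_source("BREC"))
```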
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
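&lt;br /&gt;
A minimal sketch of that cleanup, assuming the column is a &amp;lt;code&amp;gt;text&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;varchar&amp;lt;/code&amp;gt; type and that no foreign key on &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; blocks the change (neither assumption has been checked against the actual schema):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal id.&lt;br /&gt;
-- The where clause limits the update to the rows that need it.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;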
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between the column names and the primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, in that it does not validate anything concerning the rows affected, but it is adequate.&lt;br /&gt;
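&lt;br /&gt;
A minimal sketch of the change, assuming it is applied directly to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema rather than in MS Access:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace every 0 animal ID number with NULL.&lt;br /&gt;
-- No other column of the affected rows is examined.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;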
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
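&lt;br /&gt;
A minimal sketch of the second half of that change. It assumes the new code is spelled exactly &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; and has already been added to ARRIVAL_SOURCES; the insert into ARRIVAL_SOURCES is omitted because its column names are not given here:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: assumes a &amp;#039;none&amp;#039; row already exists in ARRIVAL_SOURCES.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  where fa_data_source is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;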
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=536</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=536"/>
		<updated>2026-02-16T16:52:05Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not 0, U, or MISS */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which help eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
        SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
        TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
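&lt;br /&gt;
As a minimal sketch of this substitution (an illustration only, not necessarily the code in the commit below), the NULLs can be replaced at query time:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: substitute the UNK person for a NULL made_by.&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
     , coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;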
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore. Default to data from Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
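&lt;br /&gt;
The mapping described above can be sketched as the following query.  This is an illustration, not the actual conversion code; the period codes &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; and the output column names are assumptions.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: one row per follow and time period that has at least one observer.&lt;br /&gt;
-- The &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; period codes are assumed for illustration.&lt;br /&gt;
select fol_date, fol_b_animid, &amp;#039;AM&amp;#039; as period&lt;br /&gt;
     , nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) as obs_brec&lt;br /&gt;
     , nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
union all&lt;br /&gt;
select fol_date, fol_b_animid, &amp;#039;PM&amp;#039; as period&lt;br /&gt;
     , nullif(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) as obs_brec&lt;br /&gt;
     , nullif(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;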
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
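&lt;br /&gt;
A sketch of a query that should list these follows, assuming, as in problem #37, that a NULL or blank observer column means no observer was recorded:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: follows where exactly one of a period&amp;#039;s two observer columns is blank.&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
     or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;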
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolving this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
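&lt;br /&gt;
A minimal sketch of the data change, assuming the fix is applied to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and that the &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; value has already been added to &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt;.  Where the change is actually made is not recorded here, so treat this as illustrative only.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: the target schema (clean) is an assumption.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  WHERE fa_type_of_cycle IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;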
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with either the community entry date or the biography data.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
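&lt;br /&gt;
The change might look something like the following sketch.  It assumes the update is run directly against &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and reuses the join condition from the query above; the statement actually used in the conversion may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: recode n/a to MISS for arriving females.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  FROM clean.biography&lt;br /&gt;
  WHERE biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        AND biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        AND follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;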
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
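&lt;br /&gt;
The case corrections listed above could be applied along the lines of this sketch.  It assumes the cleanup is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, as the note under &amp;quot;Bad data&amp;quot; suggests; adding the new codes is a separate step and is not shown.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the character case of the known codes.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  WHERE fa_data_source IN (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  WHERE fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;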
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
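&lt;br /&gt;
The looser check described in the problem statement, which leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;, is not listed above.  A sketch of it, derived from the previous query by dropping that column, follows.  It is presumed to be the check behind the 430-row figure, but the count has not been re-verified against this exact text.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (sketch)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;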
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, October 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
	       , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The MS Access database seems to allow this condition, most likely because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be either &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; or greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything concerning the rows affected, but it is adequate.&lt;br /&gt;
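&lt;br /&gt;
As a sketch, and assuming the change is made in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (the schema the query above reads from), the replacement could be written as:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: blanket replacement of 0 with NULL, no per-row checks.&lt;br /&gt;
UPDATE clean.biography&lt;br /&gt;
  SET b_animid_num = NULL&lt;br /&gt;
  WHERE b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;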
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
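&lt;br /&gt;
A minimal sketch of that cleanup, assuming it is run directly against &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; rather than inside the conversion program:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: strip trailing spaces from the arriving animid.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_b_arr_animid = rtrim(fa_b_arr_animid)&lt;br /&gt;
  WHERE fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&amp;lt;/pre&amp;gt;&lt;br /&gt;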
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=535</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=535"/>
		<updated>2026-02-16T16:47:37Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* Solution */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, table does not exist&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
      -------------+--------------------+----------&lt;br /&gt;
       TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
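&lt;br /&gt;
A minimal sketch of the substitution, assuming it is expressed as a query over the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and that the unknown person&amp;#039;s code is the literal &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt;; the committed conversion code may do this differently:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: report UNK wherever made_by is NULL.&lt;br /&gt;
SELECT date_of_update&lt;br /&gt;
     , chimp_id&lt;br /&gt;
     , COALESCE(made_by, &amp;#039;UNK&amp;#039;) AS made_by&lt;br /&gt;
  FROM clean.community_membership_update_log;&amp;lt;/pre&amp;gt;&lt;br /&gt;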
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error; ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
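&lt;br /&gt;
As a minimal sketch only (not necessarily how the conversion actually does it), the cleanups for problems #31 and #32 could be combined in one statement.  It assumes the cleaned data is kept in a writable &amp;lt;code&amp;gt;clean.follow&amp;lt;/code&amp;gt; table, and uses the same &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; expression as the queries for problem #33.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: normalize focal animids (assumes clean.follow is writable)&lt;br /&gt;
update clean.follow&lt;br /&gt;
  set fol_b_animid = upper(rtrim(fol_b_animid))&lt;br /&gt;
  where fol_b_animid &amp;lt;&amp;gt; upper(rtrim(fol_b_animid));&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;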
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that were done before their focal was under study, that is, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
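&lt;br /&gt;
A minimal sketch of this substitution for the morning period, assuming the observer values are trimmed as described in problem #36.  The output column names &amp;lt;code&amp;gt;am_obs_brec&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;am_obs_tiki&amp;lt;/code&amp;gt; are made up for illustration; this is not necessarily how the conversion itself is written.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: fill a missing observer with &amp;#039;NONE&amp;#039; when the other is present&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as am_obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as am_obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;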
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates is complicated in the conversion process, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
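&lt;br /&gt;
A minimal sketch of the substitution, assuming the &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; code is applied while reading &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;.  The &amp;lt;code&amp;gt;cycle_state&amp;lt;/code&amp;gt; column name is made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: arrivals with no cycle information get the MISS code&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , coalesce(follow_arrival.fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state&lt;br /&gt;
  from clean.follow_arrival;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;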
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with either the community entry date or the biography.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
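&lt;br /&gt;
To illustrate the cutoff arithmetic used in the query above, here is a small worked example.  The follow date is made up for illustration; the cutoff lands 15 years before it, which is why the test amounts to &amp;quot;at least 15 years of age&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: for a (hypothetical) follow on 2000-06-15 the cutoff is 1985-06-15.&lt;br /&gt;
-- A female with a birthdate on or before the cutoff is reported.&lt;br /&gt;
select &amp;#039;2000-06-15&amp;#039;::DATE&lt;br /&gt;
         - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
         - &amp;#039;14 years&amp;#039;::INTERVAL as cutoff;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;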
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
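&lt;br /&gt;
A minimal sketch of the cycle code change, assuming it is applied directly to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and limited to the rows selected by the query above.  The statement is an illustration, not the conversion code actually used.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: only rows for female arriving individuals are changed.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;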
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;, or&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
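&lt;br /&gt;
The case corrections listed above could be made in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema with statements like the following sketch.  It assumes the misspellings are exactly the three values named above; the statements are illustrative, not the conversion code actually used.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Normalize the spelling of Brec.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
-- Normalize the spelling of Tiki.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;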
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
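&lt;br /&gt;
No query is listed above for the third check, the one that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;.  The following is a sketch that follows the same pattern as the previous query; it has not been confirmed that it reproduces the count of 430 rows.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;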
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
IAN TO FIX&lt;br /&gt;
Where both start and end times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
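&lt;br /&gt;
A sketch of the cleanup, assuming it is run against &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and that only the focal id column needs trimming.  It is illustrative only, not the conversion code actually used.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Touch only the rows that actually have trailing spaces.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;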
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The MS Access database seems to allow this condition, likely because of a difference in character case between column names and primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be either greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything concerning the rows affected, but it is adequate.&lt;br /&gt;
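&lt;br /&gt;
A sketch of the change, assuming it is made directly in &amp;lt;code&amp;gt;clean.biography&amp;lt;/code&amp;gt;.  It is illustrative only; the conversion may apply the change elsewhere.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Replace every 0 animal ID number with NULL, without further validation.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;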
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (It is not in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
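&lt;br /&gt;
A sketch of the second half of this change, assuming a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; row has already been added to ARRIVAL_SOURCES.  Adding that row is not shown, because the columns of ARRIVAL_SOURCES are not described on this page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: the none value already exists in ARRIVAL_SOURCES.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  where fa_data_source is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;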
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=534</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=534"/>
		<updated>2026-02-16T16:44:38Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
        SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
        TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
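&lt;br /&gt;
A minimal sketch of one way the conversion could keep only the time of day, assuming &amp;lt;code&amp;gt;BREC_time&amp;lt;/code&amp;gt; is stored as a timestamp (as the &amp;lt;code&amp;gt;::DATE&amp;lt;/code&amp;gt; cast in the query above suggests).  The actual conversion code may differ; the &amp;lt;code&amp;gt;brec_time_only&amp;lt;/code&amp;gt; name is hypothetical.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show the stored value beside its time-of-day part.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;&lt;br /&gt;
     , &amp;quot;BREC_time&amp;quot;::TIME as brec_time_only&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;&lt;br /&gt;
  where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;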
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
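&lt;br /&gt;
The substitution described in this solution can be sketched as follows.  This is an illustration only, not the code from the commit; it assumes the unknown person&amp;#039;s code is &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt;, and the &amp;lt;code&amp;gt;made_by_converted&amp;lt;/code&amp;gt; name is hypothetical.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace a NULL MadeBy with the unknown person.&lt;br /&gt;
select log.*&lt;br /&gt;
     , coalesce(log.made_by, &amp;#039;UNK&amp;#039;) as made_by_converted&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where log.made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;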
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
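&lt;br /&gt;
The cleanups for problems #31 and #32 can be previewed together.  The following is a sketch only, not the conversion code; the &amp;lt;code&amp;gt;cleaned_animid&amp;lt;/code&amp;gt; name is hypothetical.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show each animid that would change beside its cleaned form.&lt;br /&gt;
select fol_b_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;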
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person both observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
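&lt;br /&gt;
Because the observer data in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized (see problem #40), the self-join on differing case in the query above may no longer return rows there.  A minimal sketch of an alternative follows.  It assumes that every observer value containing a &amp;quot;/&amp;quot; is of interest, and it uses only the columns already named above:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: every distinct observer value that contains a &amp;quot;/&amp;quot;&lt;br /&gt;
WITH people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
  UNION&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
  UNION&lt;br /&gt;
  SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
  UNION&lt;br /&gt;
  SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
)&lt;br /&gt;
SELECT people.person&lt;br /&gt;
  FROM people&lt;br /&gt;
  -- NULL values drop out here, because STRPOS(NULL, ...) is NULL&lt;br /&gt;
  WHERE STRPOS(people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
  ORDER BY LOWER(people.person), people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;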
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  Ignoring case, that is about half as many unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates during the conversion process is complicated, so resolving this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
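&lt;br /&gt;
One way to pick the spelling to keep can be sketched as follows.  This is an illustration under assumptions, not the conversion&amp;#039;s actual code: it treats a value as mixed case when it is neither all upper case nor all lower case, and it breaks any remaining ties alphabetically.  The output column names &amp;lt;code&amp;gt;folded&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;preferred&amp;lt;/code&amp;gt; are made up for the example, and &amp;lt;code&amp;gt;DISTINCT ON&amp;lt;/code&amp;gt; is specific to PostgreSQL.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: one preferred spelling per case-folded observer name&lt;br /&gt;
WITH people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
  UNION&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
  UNION&lt;br /&gt;
  SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
  UNION&lt;br /&gt;
  SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
)&lt;br /&gt;
SELECT DISTINCT ON (LOWER(people.person))&lt;br /&gt;
       LOWER(people.person) AS folded&lt;br /&gt;
     , people.person AS preferred&lt;br /&gt;
  FROM people&lt;br /&gt;
  WHERE people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  -- Within each folded name, mixed-case spellings sort first (TRUE before FALSE)&lt;br /&gt;
  ORDER BY LOWER(people.person)&lt;br /&gt;
         , (people.person &amp;lt;&amp;gt; UPPER(people.person)&lt;br /&gt;
            AND people.person &amp;lt;&amp;gt; LOWER(people.person)) DESC&lt;br /&gt;
         , people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;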
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with the community entry date or the biography data.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
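&lt;br /&gt;
To find the rows that would need fixing by hand, a variation on the first query above can leave out the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; codes that are to be changed in bulk.  This is a sketch; it assumes that every non-female row with a code other than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt; is a candidate for a hand fix:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: non-female arrivals whose cycle code is neither n/a nor 0&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;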
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
IAN TO FIX IN ACCESS - CHECK DATES OF FOLLOWS AND ACCURACY OF IDS&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
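&lt;br /&gt;
The endpoint arithmetic can be checked with made-up dates.  In this sketch the birthdates and the follow date are hypothetical, as are the names &amp;lt;code&amp;gt;d&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;flagged&amp;lt;/code&amp;gt;; the comparison is the one used in the query for this problem.  The first row, a female whose 15th birthday falls on the follow date, is flagged.  The second, born one day later and so still 14, is not.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: hypothetical dates, to show where the age cutoff falls&lt;br /&gt;
select d.birthdate&lt;br /&gt;
     , d.follow_date&lt;br /&gt;
     , d.birthdate&lt;br /&gt;
         &amp;lt;= (d.follow_date&lt;br /&gt;
             - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
             - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
            ) as flagged&lt;br /&gt;
  from (values (DATE &amp;#039;1985-06-15&amp;#039;, DATE &amp;#039;2000-06-15&amp;#039;)&lt;br /&gt;
             , (DATE &amp;#039;1985-06-16&amp;#039;, DATE &amp;#039;2000-06-15&amp;#039;)&lt;br /&gt;
       ) as d (birthdate, follow_date);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;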
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
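&lt;br /&gt;
The change to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; might look like the statement below.  It is a sketch, not the statement actually run; it reuses the join and conditions of the query above and assumes that a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; code is acceptable in &amp;lt;code&amp;gt;fa_type_of_cycle&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: females with a cycle code of n/a get MISS instead&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;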
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
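&lt;br /&gt;
The two case corrections above might be applied with statements like the following.  This is a sketch: it assumes the corrections are made directly to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;, and it covers only the spellings named above.  How the additional codes are registered is not shown, because that depends on the design of the target system.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the case variants listed in the solution&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;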
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
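&lt;br /&gt;
The check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; is described in the problem statement but has no query here.  A sketch of it, derived from the query above by dropping the end time, follows.  As in the query above, rows with a NULL start time are grouped together by &amp;lt;code&amp;gt;group by&amp;lt;/code&amp;gt; but are not matched by the &amp;lt;code&amp;gt;=&amp;lt;/code&amp;gt; comparison; &amp;lt;code&amp;gt;is not distinct from&amp;lt;/code&amp;gt; could be substituted if such rows should be listed.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: checking against start time alone&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;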
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, October 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
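&lt;br /&gt;
A minimal sketch of this cleanup, assuming it is applied directly to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; with an &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; rather than inside the conversion program (the actual fix may be implemented differently):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Strip trailing spaces from the focal animid.&lt;br /&gt;
-- The WHERE clause limits the update to rows that actually change.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;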
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
	       , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The MS Access database likely allows this condition because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force, but adequate, solution; it does not validate anything concerning the rows affected.&lt;br /&gt;
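&lt;br /&gt;
A minimal sketch of the change, assuming it is made with a direct &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; on the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (the conversion program may do this differently):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Replace every zero animal ID number with NULL.&lt;br /&gt;
-- No other column of the affected rows is checked.&lt;br /&gt;
UPDATE clean.biography&lt;br /&gt;
  SET b_animid_num = NULL&lt;br /&gt;
  WHERE b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;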
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
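&lt;br /&gt;
A minimal sketch of this cleanup, assuming a direct &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; on &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the actual fix may instead live in the conversion program):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Strip trailing spaces from the arriving animid.&lt;br /&gt;
-- The WHERE clause limits the update to rows that actually change.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_b_arr_animid = rtrim(fa_b_arr_animid)&lt;br /&gt;
  WHERE fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;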
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=533</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=533"/>
		<updated>2026-02-16T16:41:21Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#46) There are follow_arrivals where non-females have a cycle code that is other than n/a */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which help eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
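&lt;br /&gt;
With follow_arrival as the source of truth, the begin-in-nest value can be derived from the focal&amp;#039;s first arrival.  The following is a sketch only, not the conversion code.  It assumes, as the query above implies, that fa_type_of_nesting values 1 and 3 mean the individual started in a nest.  Treating a NULL nesting type as 0, and restricting the join to the focal&amp;#039;s own arrival row, are choices made for this example.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.fa_fol_date, spans.fa_fol_b_focal_animid&lt;br /&gt;
     -- 1 when the first arrival says &amp;quot;started in nest&amp;quot;, else 0&lt;br /&gt;
     , CASE WHEN first_arrivals.fa_type_of_nesting IN (1, 3)&lt;br /&gt;
            THEN 1 ELSE 0 END AS begin_in_nest&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN easy.follow_arrival AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = spans.fa_fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
          -- Only the focal&amp;#039;s own arrival row (a restriction added in this sketch)&lt;br /&gt;
          AND first_arrivals.fa_b_arr_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;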
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
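&lt;br /&gt;
As a sketch, the cleanup for this problem and for the trailing spaces of problem #31 can be previewed together in the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.  This only illustrates the normalization; the conversion process itself may do it differently.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: list each animid that the cleanup would change,&lt;br /&gt;
-- next to its cleaned-up value.&lt;br /&gt;
select fol_date&lt;br /&gt;
     , fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;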
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names, and excluding the duplicates, is complicated in the conversion process.  So resolution of this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
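&lt;br /&gt;
A minimal sketch of the substitution, for illustration only.  It assumes the new code is the literal string &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, and it shows only the affected rows.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: arrivals with no cycle information,&lt;br /&gt;
-- shown with the &amp;#039;MISS&amp;#039; code they would receive.&lt;br /&gt;
select fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid&lt;br /&gt;
     , coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where fa_type_of_cycle is null&lt;br /&gt;
  order by fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;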
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with either the community entry date or the biography.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
FIX THE FEW ACTUAL MALES THAT DON&amp;#039;T HAVE AN n/a&lt;br /&gt;
&lt;br /&gt;
Change non-females (INCLUDING UNKNOWN SEX) who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;. IAN TO FIX IN ACCESS&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
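&lt;br /&gt;
As a worked example of the cutoff arithmetic, using a made-up follow date of 2020-06-15 (an assumption for illustration, not a value from the data), the right-hand side of the birthdate comparison can be evaluated on its own:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical follow date; the cutoff falls 15 years earlier, on 2005-06-15.&lt;br /&gt;
select &amp;#039;2020-06-15&amp;#039;::DATE&lt;br /&gt;
         - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
         - &amp;#039;14 years&amp;#039;::INTERVAL as cutoff;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A female with a birthdate on or before the cutoff, that is, one who is at least 15 years old on the follow date, is reported.&lt;br /&gt;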
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
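&lt;br /&gt;
A minimal sketch of that change, assuming it is applied directly to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema with the same join used to list the bad data; the statement actually used in the conversion may differ:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumed form of the fix; not taken from the conversion scripts.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;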
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;, or&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
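&lt;br /&gt;
The two spelling corrections listed in this solution could be sketched as follows, assuming they are applied to the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. Adding the other codes to the table of valid sources is not shown, because that table&amp;#039;s columns are not given here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumed form of the fix: normalize the capitalization variants.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;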
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
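&lt;br /&gt;
The Problem section also reports a count when &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; is left off, but no query is listed for that check. A sketch of it, assuming it is simply the query above with &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; removed from the grouping and from the match:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (assumed form of the check)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rows with a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; start time are grouped together by the &amp;lt;code&amp;gt;group by&amp;lt;/code&amp;gt; but are never matched by the &amp;lt;code&amp;gt;=&amp;lt;/code&amp;gt; comparison, so any such rows are not listed.&lt;br /&gt;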
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
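&lt;br /&gt;
A minimal sketch of the removal, assuming the column is a text or varchar type (so that trailing spaces are significant in comparisons) and that the update is run against the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumed form of the fix; touches only rows that have trailing spaces.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;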
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The MS Access database likely allows this condition because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force, but adequate, solution: it does not validate anything else concerning the affected rows.&lt;br /&gt;
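&lt;br /&gt;
A minimal sketch of this change, assuming it is run against the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumed form of the fix; not taken from the conversion scripts.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;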
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (It is not in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focal pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
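&lt;br /&gt;
A sketch of the second half of this solution, assuming the &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value has already been created in ARRIVAL_SOURCES (that table&amp;#039;s columns are not given here, so creating the value is not shown):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumed form of the fix; not taken from the conversion scripts.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  where fa_data_source is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;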
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=532</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=532"/>
		<updated>2026-02-16T16:36:10Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#45) The follow_arrival.fa_update column is not preserved */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When the query is run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
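&lt;br /&gt;
The effect of this change can be previewed in SQL.  The following is only a sketch: the actual fix was made by hand in MS Access, the query assumes the column is still textual in the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema, and the &amp;lt;code&amp;gt;animid_num&amp;lt;/code&amp;gt; output name is made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only; animid_num is a hypothetical output column name.&lt;br /&gt;
-- Keep only values of the form CH followed by digits,&lt;br /&gt;
-- drop the 2-character prefix, and cast the rest to an integer.&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;&lt;br /&gt;
     , &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
     , substring(&amp;quot;B_AnimID_num&amp;quot; from 3)::integer as animid_num&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; ~ &amp;#039;^CH[0-9]+$&amp;#039;&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;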
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
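&lt;br /&gt;
How the conversion repairs these values is not recorded here.  One plausible approach, shown only as a sketch under that assumption, is to keep the time-of-day portion and discard the date portion; the &amp;lt;code&amp;gt;time_only&amp;lt;/code&amp;gt; output name is made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only; time_only is a hypothetical output column name.&lt;br /&gt;
-- Casting to TIME drops the date portion, whatever date was stored.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;&lt;br /&gt;
     , &amp;quot;BREC_time&amp;quot;::TIME as time_only&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;&lt;br /&gt;
  where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;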
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 by ICG.&lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022.&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
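&lt;br /&gt;
The substitution described above can be sketched in SQL.  This is an illustration only, not the code from the commit; it assumes &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; is the code of the unknown person and reuses the column names from the queries on this page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: report the unknown person wherever made_by is NULL.&lt;br /&gt;
select date_of_update&lt;br /&gt;
     , chimp_id&lt;br /&gt;
     , COALESCE(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;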
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These involve &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
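&lt;br /&gt;
As a rough sketch (not the actual conversion code), the cleanup steps from problems #31 and #32 can be combined into a single expression, the same &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; that the queries for problem #33 use.  The column aliases are made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: list each animid that the combined cleanup would change&lt;br /&gt;
select distinct fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by original_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;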
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that were done before their focal was under study, that is, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person both observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
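No query is recorded for this problem.  The following is a sketch of one way to list the affected follows.  It assumes that &amp;quot;only one observer&amp;quot; means that, within the AM or the PM period, exactly one of the two observer columns is blank, using the same blank test as the query for problem #37.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: follows where a period has exactly one recorded observer.&lt;br /&gt;
-- Each parenthesized test is true when the column is blank; comparing&lt;br /&gt;
-- the two tests with &amp;lt;&amp;gt; is true when exactly one of the pair is blank.&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;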
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding duplicates in the conversion process is complicated, so resolving this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with the community entry date or the biography.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
FOLLOWUP WITH KARL. &lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
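A minimal sketch of the second change follows.  It assumes that the change is applied to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and that the code is stored as the text value &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;; neither is confirmed here, and the real conversion may do this differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: assumes the clean schema is the right place for this&lt;br /&gt;
-- change and that the cycle code is stored as the text &amp;#039;0&amp;#039;.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;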
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
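&lt;br /&gt;
To illustrate the &amp;quot;at least 15 years of age&amp;quot; remark, here is a small self-contained check of the same date arithmetic.  The two dates are invented for the example and do not come from the data.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only, with made-up dates.  The first row is exactly 15 years&lt;br /&gt;
-- old on the follow date and is flagged (true); the second row is one&lt;br /&gt;
-- day short of 15 and is not flagged (false).&lt;br /&gt;
select t.birthdate&lt;br /&gt;
     , t.follow_date&lt;br /&gt;
     , t.birthdate&lt;br /&gt;
         &amp;lt;= (t.follow_date&lt;br /&gt;
             - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
             - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
            ) as flagged&lt;br /&gt;
  from (values (date &amp;#039;1985-06-15&amp;#039;, date &amp;#039;2000-06-15&amp;#039;)&lt;br /&gt;
             , (date &amp;#039;1985-06-16&amp;#039;, date &amp;#039;2000-06-15&amp;#039;))&lt;br /&gt;
       as t (birthdate, follow_date);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;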
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
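&lt;br /&gt;
In SQL a &amp;lt;code&amp;gt;&amp;amp;lt;&amp;amp;gt;&amp;lt;/code&amp;gt; comparison is never true for NULL, so the queries above skip any rows where fa_data_source is NULL.  Whether such rows exist is not stated here.  A sketch of a summary that would also count them:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: like the summary above, but NULL values are included&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source is null&lt;br /&gt;
        or follow_arrival.fa_data_source&lt;br /&gt;
             not in (&amp;#039;Tiki&amp;#039;, &amp;#039;Tiki_Mom&amp;#039;, &amp;#039;Tiki_ID&amp;#039;, &amp;#039;Brec&amp;#039;)&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;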
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
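&lt;br /&gt;
The start-time-only check described in the Problem section has no query listed here. A sketch of what it might look like, assuming it mirrors the start and end time query with &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; left off (this is not necessarily the query that produced the 430 row count):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: checking against start time only (fa_time_end left off)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;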
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, October 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
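&lt;br /&gt;
A minimal sketch of such a cleanup, assuming the table to change is &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and that an in-place update is acceptable; this is not necessarily the statement used in the conversion:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: the cleanup is done with an in-place UPDATE.&lt;br /&gt;
-- Only rows that actually have trailing spaces are touched.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;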
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between the column names and the primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be either greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This solution is adequate but brute-force: it does not validate anything concerning the rows affected.&lt;br /&gt;
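&lt;br /&gt;
A minimal sketch of the change, assuming it is applied directly to &amp;lt;code&amp;gt;clean.biography&amp;lt;/code&amp;gt; (the conversion may apply it elsewhere):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Assumption: an in-place UPDATE of the clean schema is acceptable.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;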
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (It is not in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
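&lt;br /&gt;
A minimal sketch for the arriving animid column, assuming the table to change is &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the query above reads from the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema) and that an in-place update is acceptable:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: the cleanup is done with an in-place UPDATE.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_b_arr_animid = rtrim(fa_b_arr_animid)&lt;br /&gt;
  where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;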
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=531</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=531"/>
		<updated>2026-02-16T16:33:46Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#44) There are follow arrivals where the arriving chimp arrives before being under study */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
        SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
        TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
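&lt;br /&gt;
A minimal sketch of the substitution, not the code from the commit cited below.  It assumes &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; is the code given to the unknown person, and it selects only columns that other queries on this page already use:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace a NULL made_by with the assumed UNK person code.&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
     , coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;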
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
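&lt;br /&gt;
Combined with the trailing-space removal of problem #31, the cleanup can be sketched as below.  This is an illustration, not the conversion program&amp;#039;s code; the output column names are made up for the example.  It lists each raw animid that the cleanup would change, next to its cleaned form:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: raw animids in easy.follow next to their cleaned form.&lt;br /&gt;
select fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;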
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
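&lt;br /&gt;
The mapping described above can be sketched as the following query.  This is an illustration under assumptions, not the conversion program itself: the &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; period codes and the output column names are hypothetical.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: one candidate FOLLOW_OBSERVERS row per follow and period.&lt;br /&gt;
-- The &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; period codes are assumed, not taken from PERIODS.&lt;br /&gt;
select fol_date, fol_b_animid, &amp;#039;AM&amp;#039; as period&lt;br /&gt;
     , nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) as obs_brec&lt;br /&gt;
     , nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
union all&lt;br /&gt;
select fol_date, fol_b_animid, &amp;#039;PM&amp;#039; as period&lt;br /&gt;
     , nullif(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) as obs_brec&lt;br /&gt;
     , nullif(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
A period appears in the result only when at least one of its two observer columns is non-empty, matching the rule that no row is created otherwise.&lt;br /&gt;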
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observer.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
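&lt;br /&gt;
No bad-data query is recorded for this problem.  A sketch of one follows, assuming an observer counts as missing when the column is NULL or empty after trimming (the same test the query for problem #37 uses).  It lists follows where, in either time period, exactly one of the two observer columns is filled in:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: follows with exactly one observer in the AM or PM period.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , fol_am_observer_1, fol_am_observer_2&lt;br /&gt;
     , fol_pm_observer_1, fol_pm_observer_2&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
     or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;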
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
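&lt;br /&gt;
A minimal sketch of that change, assuming the &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; code has already been added to &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; and that the fix is applied directly to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  The actual conversion scripts may do this differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: run against the clean schema; not taken from the conversion scripts.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  WHERE fa_type_of_cycle IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;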
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Something is wrong with either the community entry date or the biography data.&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
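&lt;br /&gt;
The second change could be sketched as follows.  This assumes the update is made in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and that &amp;quot;non-female&amp;quot; means any &amp;lt;code&amp;gt;b_sex&amp;lt;/code&amp;gt; value other than &amp;lt;code&amp;gt;F&amp;lt;/code&amp;gt;, as in the queries above.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: illustrative only; not taken from the conversion scripts.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  FROM clean.biography&lt;br /&gt;
  WHERE biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        AND biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        AND follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;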
&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
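&lt;br /&gt;
A sketch of the two case corrections above, assuming they are applied to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, which leaves the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema untouched so the bad data can still be queried there.  The statement is illustrative and is not taken from the conversion scripts.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: only the three misspellings named above need correcting.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = CASE&lt;br /&gt;
        WHEN fa_data_source IN (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;) THEN &amp;#039;Brec&amp;#039;&lt;br /&gt;
        WHEN fa_data_source = &amp;#039;TikI&amp;#039; THEN &amp;#039;Tiki&amp;#039;&lt;br /&gt;
      END&lt;br /&gt;
  WHERE fa_data_source IN (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;, &amp;#039;TikI&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;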
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for a row to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
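&lt;br /&gt;
No query is listed for the check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;.  A sketch following the same pattern as the query above is given here; it is assumed, not confirmed, that the 430-row figure in the problem description was computed this way.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (assumed form of the check)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;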
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
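&lt;br /&gt;
A minimal sketch of the cleanup, assuming the column is a &amp;lt;code&amp;gt;text&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;varchar&amp;lt;/code&amp;gt; type, so that trailing spaces are significant in comparisons, as the query above implies:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: illustrative only; not taken from the conversion scripts.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_fol_b_focal_animid = RTRIM(fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE fa_fol_b_focal_animid &amp;lt;&amp;gt; RTRIM(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;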
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely due to difference in character case between column names and primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything about the affected rows, but it is adequate.&lt;br /&gt;
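&lt;br /&gt;
As a sketch, assuming the change is made in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, the replacement of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; could look like:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: illustrative only; no checks are made on the affected rows.&lt;br /&gt;
UPDATE clean.biography&lt;br /&gt;
  SET b_animid_num = NULL&lt;br /&gt;
  WHERE b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;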
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focal pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
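&lt;br /&gt;
A minimal sketch of the removal, assuming the fix is applied to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; with a plain &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; (the conversion code may instead trim the values as they are copied):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Assumption: fa_b_arr_animid is a text-like column in clean.follow_arrival.&lt;br /&gt;
-- Only rows that actually have trailing spaces are rewritten.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_b_arr_animid = rtrim(fa_b_arr_animid)&lt;br /&gt;
  WHERE fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&amp;lt;/pre&amp;gt;&lt;br /&gt;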
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
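&lt;br /&gt;
A minimal sketch of the substitution, assuming the &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; row has already been added to ARRIVAL_SOURCES (that table&amp;#039;s column names are not given here, so the insert is not shown):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Assumption: a &amp;#039;none&amp;#039; code already exists in ARRIVAL_SOURCES.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  WHERE fa_data_source IS NULL;&amp;lt;/pre&amp;gt;&lt;br /&gt;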
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=530</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=530"/>
		<updated>2026-02-16T16:28:54Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#43) There are follow arrivals with no related follow */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
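&lt;br /&gt;
The substitution described above could be sketched as follows, assuming an &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; row already exists in &amp;lt;code&amp;gt;clean.people&amp;lt;/code&amp;gt; and the change is applied to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (the referenced commit may implement it differently):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Assumption: &amp;#039;UNK&amp;#039; is the people.person code of the unknown person.&lt;br /&gt;
UPDATE clean.community_membership_update_log&lt;br /&gt;
  SET made_by = &amp;#039;UNK&amp;#039;&lt;br /&gt;
  WHERE made_by IS NULL;&amp;lt;/pre&amp;gt;&lt;br /&gt;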
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
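&lt;br /&gt;
As a minimal sketch, the cleanup for this problem and for problem #31 can be previewed together, using the same &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; expression that the problem #33 queries use.  The &amp;lt;code&amp;gt;cleaned_animid&amp;lt;/code&amp;gt; column name is made up for illustration, and the actual conversion code may do this differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show each raw animid next to its cleaned form.&lt;br /&gt;
-- Rows appear when trailing spaces, lower-case letters, or both are present.&lt;br /&gt;
select fol_date&lt;br /&gt;
     , fol_b_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid  -- hypothetical name&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;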
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
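&lt;br /&gt;
The mapping described above can be sketched as a query that lists the rows the conversion would produce.  This is an illustration, not the actual conversion code: the &amp;lt;code&amp;gt;AM&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;PM&amp;lt;/code&amp;gt; period codes and the lower-case output column names are assumptions.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: one candidate row per follow and time period,&lt;br /&gt;
-- skipping periods where both observer columns are NULL or blank.&lt;br /&gt;
SELECT fol_date, fol_b_animid&lt;br /&gt;
     , &amp;#039;AM&amp;#039; AS period  -- assumed period code&lt;br /&gt;
     , NULLIF(BTRIM(fol_am_observer_1), &amp;#039;&amp;#039;) AS obs_brec&lt;br /&gt;
     , NULLIF(BTRIM(fol_am_observer_2), &amp;#039;&amp;#039;) AS obs_tiki&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
  WHERE COALESCE(BTRIM(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        OR COALESCE(BTRIM(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
UNION ALL&lt;br /&gt;
SELECT fol_date, fol_b_animid&lt;br /&gt;
     , &amp;#039;PM&amp;#039; AS period  -- assumed period code&lt;br /&gt;
     , NULLIF(BTRIM(fol_pm_observer_1), &amp;#039;&amp;#039;) AS obs_brec&lt;br /&gt;
     , NULLIF(BTRIM(fol_pm_observer_2), &amp;#039;&amp;#039;) AS obs_tiki&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
  WHERE COALESCE(BTRIM(fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        OR COALESCE(BTRIM(fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  ORDER BY fol_date, fol_b_animid, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;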
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person both observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
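&lt;br /&gt;
A minimal sketch of the substitution, shown for the morning columns only.  It assumes the person code is literally &amp;lt;code&amp;gt;NONE&amp;lt;/code&amp;gt;; the output column names are chosen for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: follows where exactly one AM observer is recorded,&lt;br /&gt;
-- with the missing observer replaced by the NONE person.&lt;br /&gt;
SELECT fol_date, fol_b_animid&lt;br /&gt;
     , COALESCE(NULLIF(BTRIM(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) AS obs_brec&lt;br /&gt;
     , COALESCE(NULLIF(BTRIM(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) AS obs_tiki&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
  -- Comparing the two booleans with &amp;lt;&amp;gt; keeps rows where only one is blank.&lt;br /&gt;
  WHERE (COALESCE(BTRIM(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        &amp;lt;&amp;gt; (COALESCE(BTRIM(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  ORDER BY fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;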
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
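&lt;br /&gt;
To find the people to mark, a sketch like the following lists every distinct observer value containing a &amp;quot;/&amp;quot; character, whether or not it has a case variant.  It also returns n/a, which would need to be excluded by hand.  The &amp;lt;code&amp;gt;o&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;observer&amp;lt;/code&amp;gt; aliases are made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: unpivot the four observer columns, then filter on &amp;quot;/&amp;quot;.&lt;br /&gt;
SELECT DISTINCT BTRIM(o.observer) AS person&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
       CROSS JOIN LATERAL&lt;br /&gt;
         (VALUES (follow.fol_am_observer_1)&lt;br /&gt;
               , (follow.fol_am_observer_2)&lt;br /&gt;
               , (follow.fol_pm_observer_1)&lt;br /&gt;
               , (follow.fol_pm_observer_2)) AS o (observer)&lt;br /&gt;
  WHERE STRPOS(o.observer, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
  ORDER BY person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;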
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
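&lt;br /&gt;
A minimal sketch of the substitution, previewing the value each affected arrival would receive.  The &amp;lt;code&amp;gt;cycle_state&amp;lt;/code&amp;gt; column name is an assumption made for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: arrivals with no cycle information get the MISS code.&lt;br /&gt;
select fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid&lt;br /&gt;
     , coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state  -- hypothetical name&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where fa_type_of_cycle is null&lt;br /&gt;
  order by fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;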
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Ignore for now. Eventually fix by hand.&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
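&lt;br /&gt;
The rows affected by the second change can be previewed with a sketch like the following.  It assumes the code is stored as the text value &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;; the &amp;lt;code&amp;gt;new_cycle&amp;lt;/code&amp;gt; column name is made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: non-female arrivals whose cycle code 0 would become n/a.&lt;br /&gt;
select follow_arrival.fa_fol_date&lt;br /&gt;
     , follow_arrival.fa_b_arr_animid&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , follow_arrival.fa_type_of_cycle&lt;br /&gt;
     , &amp;#039;n/a&amp;#039; as new_cycle  -- hypothetical name&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;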
&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
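&lt;br /&gt;
The conversion code that makes this change is not shown on this page.  A minimal sketch of such an update, assuming it is applied directly to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and uses the same join as the query above, might look like:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only; not the actual conversion code.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;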
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
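&lt;br /&gt;
The two case corrections above could be sketched as follows.  This assumes the changes are made in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema; the actual conversion code is not shown here and may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only; not the actual conversion code.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;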
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
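&lt;br /&gt;
The query for the check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; is not recorded above.  A sketch of it, assuming it is the same query with &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; dropped from the grouping and the match, would be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (a sketch, not the original query)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;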
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
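&lt;br /&gt;
A minimal sketch of the removal, assuming the column is a variable-length text type (so that the comparison with &amp;lt;code&amp;gt;rtrim()&amp;lt;/code&amp;gt; detects trailing spaces, as in the query above):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only; not the actual conversion code.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;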
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
	       , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The MS Access database likely allows this condition because of a difference in character case between column names and primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything concerning the rows affected, but it is adequate.&lt;br /&gt;
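&lt;br /&gt;
A minimal sketch of the change, assuming it is applied directly to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only; not the actual conversion code.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;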
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
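&lt;br /&gt;
The replacement step could be sketched as follows.  This assumes the &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; code has already been added to ARRIVAL_SOURCES; that table&amp;#039;s columns are not shown on this page, so the insert into it is omitted.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only; assumes the &amp;#039;none&amp;#039; code already exists in ARRIVAL_SOURCES.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  where fa_data_source is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;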
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=529</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=529"/>
		<updated>2026-02-16T16:25:53Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#41)There are follow arrivals with NULL nesting information */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which help eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
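&lt;br /&gt;
As a sketch only (the conversion program&amp;#039;s actual code is not shown on this page), casting the column to a time discards the date part. This assumes &amp;lt;code&amp;gt;BREC_time&amp;lt;/code&amp;gt; is loaded as a timestamp, which the cast to DATE in the query above suggests; the &amp;lt;code&amp;gt;brec_time_only&amp;lt;/code&amp;gt; alias is made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show the time of day that remains once the date part is dropped.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;, &amp;quot;BREC_time&amp;quot;::TIME WITHOUT TIME ZONE as brec_time_only&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;&lt;br /&gt;
  where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;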
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; the communities they name do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
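&lt;br /&gt;
The substitution can be pictured with the query below. It is a sketch, not the code from the commit: it assumes the unknown person&amp;#039;s code is literally &amp;#039;UNK&amp;#039; and that &amp;lt;code&amp;gt;made_by&amp;lt;/code&amp;gt; is a text column, and the &amp;lt;code&amp;gt;made_by_or_unk&amp;lt;/code&amp;gt; alias is made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show each log row with the unknown person standing in for a NULL made_by.&lt;br /&gt;
select log.*, coalesce(log.made_by, &amp;#039;UNK&amp;#039;) as made_by_or_unk&lt;br /&gt;
  from clean.community_membership_update_log as log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;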
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error. Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error. Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
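&lt;br /&gt;
The two cleanups from problems #31 and #32 can be previewed together. The query below is a sketch for inspection, not the conversion program&amp;#039;s code; it reuses the &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; expression that appears in the queries for problem #33, and the column aliases are made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: list each animid that either cleanup would change, next to its cleaned form.&lt;br /&gt;
select distinct fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by original_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;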
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows dated before their focal was under study, that is, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person both observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires that a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
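&lt;br /&gt;
A minimal sketch of a query that could list the affected follows, assuming a period counts as having &amp;quot;only one observer&amp;quot; when exactly one of its two observer columns is NULL or empty after trimming.  This query is an illustration, not part of the original conversion checks.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: a period has one observer when exactly one of its two&lt;br /&gt;
-- observer columns is blank.  Comparing the two booleans with &amp;lt;&amp;gt; acts as XOR.&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;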
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
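&lt;br /&gt;
The query above only reports names containing &amp;quot;/&amp;quot; that also have a variant differing by character case.  As a hedged alternative, the sketch below lists every distinct observer value containing a &amp;quot;/&amp;quot;, with a count of how often it is used.  The &amp;lt;code&amp;gt;n_uses&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;observers&amp;lt;/code&amp;gt; names are illustrative, and values such as &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt; will also appear in the result.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: all observer values that contain a &amp;quot;/&amp;quot;, across the 4 observer columns.&lt;br /&gt;
-- NULL values drop out because STRPOS(NULL, ...) is NULL.&lt;br /&gt;
SELECT observers.person, COUNT(*) AS n_uses&lt;br /&gt;
  FROM (SELECT BTRIM(follow.fol_am_observer_1) AS person FROM clean.follow&lt;br /&gt;
        UNION ALL&lt;br /&gt;
        SELECT BTRIM(follow.fol_am_observer_2) FROM clean.follow&lt;br /&gt;
        UNION ALL&lt;br /&gt;
        SELECT BTRIM(follow.fol_pm_observer_1) FROM clean.follow&lt;br /&gt;
        UNION ALL&lt;br /&gt;
        SELECT BTRIM(follow.fol_pm_observer_2) FROM clean.follow) AS observers&lt;br /&gt;
  WHERE STRPOS(observers.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
  GROUP BY observers.person&lt;br /&gt;
  ORDER BY LOWER(observers.person), observers.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;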
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case, which amounts to about half that many unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for this and excluding the duplicates during conversion is complicated, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
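&lt;br /&gt;
The check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; is described above but not shown.  A sketch of it, assuming it follows the same pattern as the start-and-end-time query with &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; removed, might look like:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (assumed form of the check)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;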
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
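&lt;br /&gt;
A minimal sketch of one way to apply this fix, assuming the affected table is &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and that the column is not a blank-padded &amp;lt;code&amp;gt;CHAR&amp;lt;/code&amp;gt; type (neither is stated above):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: the fix is applied to clean.follow_arrival.&lt;br /&gt;
-- Only rows that actually have trailing spaces are rewritten.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Re-running the query under Bad data against the same table should then return no rows.&lt;br /&gt;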
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between the column names and the primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be either greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, in that it does not validate anything about the affected rows, but it is adequate.&lt;br /&gt;
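&lt;br /&gt;
A minimal sketch of the change described above, assuming it is applied directly to &amp;lt;code&amp;gt;clean.biography&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: the change is made in the clean schema, not in MS Access.&lt;br /&gt;
-- No other column is inspected, so nothing about the rows is validated.&lt;br /&gt;
UPDATE clean.biography&lt;br /&gt;
  SET b_animid_num = NULL&lt;br /&gt;
  WHERE b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;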
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=528</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=528"/>
		<updated>2026-02-16T16:25:10Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#35) There are follows done before a focal was under study */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When the query is run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
        SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
        TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column contains non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used in newly entered data.  The alternative, allowing NULL MadeBy values, would allow new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
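&lt;br /&gt;
A minimal sketch of the substitution, assuming a row for &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; already exists in the people table and that the change is applied to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (how the person is marked inactive is not shown, because that column is not named here):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: UNK is already present as a person.&lt;br /&gt;
UPDATE clean.community_membership_update_log&lt;br /&gt;
  SET made_by = &amp;#039;UNK&amp;#039;&lt;br /&gt;
  WHERE made_by IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;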
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
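&lt;br /&gt;
As a minimal sketch of the substitution (an illustration only, not the code from the commit), the NULL values can be replaced when reading from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  It assumes a person with the code &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; has already been created:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: substitute the unknown person for a NULL made_by.&lt;br /&gt;
-- Assumes the person code &amp;#039;UNK&amp;#039; exists.&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
     , coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;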
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore. Default to data from Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to Follow_Arrival&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
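&lt;br /&gt;
A minimal sketch of this cleanup, combined with the trailing-space removal of problem #31 (an illustration, not the conversion program&amp;#039;s actual code):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: normalize the animid by removing trailing spaces&lt;br /&gt;
-- and forcing upper-case.&lt;br /&gt;
select upper(rtrim(fol_b_animid)) as fol_b_animid, fol_date&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;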
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
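&lt;br /&gt;
A minimal sketch of the substitution for the AM observers, assuming the &amp;quot;NONE&amp;quot; person already exists (an illustration, not the conversion program&amp;#039;s actual code; the output column names are chosen here to mirror OBS_BRec and OBS_Tiki):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: fill in a missing AM observer with the NONE person.&lt;br /&gt;
-- Rows with no AM observers at all are skipped, per problem #36.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;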
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for and excluding these duplicates in the conversion process is complicated, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
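&lt;br /&gt;
A minimal sketch of the substitution, assuming the &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; code has been added to &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; (an illustration, not the conversion program&amp;#039;s actual code):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: use MISS where there is no cycle information.&lt;br /&gt;
select fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid&lt;br /&gt;
     , coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as fa_type_of_cycle&lt;br /&gt;
  from clean.follow_arrival;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;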
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
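&lt;br /&gt;
The second change could be sketched as the following update.  This is a minimal sketch, assuming the change is made in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema; the page does not say where it is applied.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only; assumes the fix is applied to clean.follow_arrival.&lt;br /&gt;
-- Changes the cycle code of non-female arriving individuals&lt;br /&gt;
-- from &amp;#039;0&amp;#039; to &amp;#039;n/a&amp;#039;.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;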
&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit and set a soft limit of 5 years in the warning system, with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit and set a soft limit of 14 years in the warning system, with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
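&lt;br /&gt;
The change to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; could be sketched as follows.  This is an illustration of the kind of update involved, not necessarily the statement that was actually run.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: set the cycle code of female arriving individuals&lt;br /&gt;
-- from &amp;#039;n/a&amp;#039; to &amp;#039;MISS&amp;#039; in the clean schema.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;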
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;, or&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source  count&lt;br /&gt;
  Tiki_GM         202&lt;br /&gt;
  Tiki_PM         165&lt;br /&gt;
  Tiki_SS         22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
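&lt;br /&gt;
The two case corrections could be sketched as follows, assuming they are applied to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the page says only that the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema).  Adding the new codes is not shown, because the table that holds the valid codes is not described in this entry.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the letter case of the data source codes.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;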
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
Checking for duplicates on the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt; finds none; the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and checking just the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
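&lt;br /&gt;
The check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; is not shown above.  A minimal sketch of it follows.  It returns one row per duplicated combination rather than the follow_arrival rows themselves, so its row count need not match the figure quoted in the Problem section; the &amp;lt;code&amp;gt;n_rows&amp;lt;/code&amp;gt; column name is introduced here for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: checking against start time alone (fa_time_end left off).&lt;br /&gt;
-- One row per duplicated combination, with the number of rows sharing it.&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
     , fa_fol_b_focal_animid&lt;br /&gt;
     , fa_b_arr_animid&lt;br /&gt;
     , fa_time_start&lt;br /&gt;
     , count(*) as n_rows&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  group by fa_fol_date&lt;br /&gt;
         , fa_fol_b_focal_animid&lt;br /&gt;
         , fa_b_arr_animid&lt;br /&gt;
         , fa_time_start&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fa_fol_date&lt;br /&gt;
         , fa_fol_b_focal_animid&lt;br /&gt;
         , fa_b_arr_animid&lt;br /&gt;
         , fa_time_start;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;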
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
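&lt;br /&gt;
A minimal sketch of that cleanup, assuming the column is a variable-length text type so that trailing spaces are significant in the comparison:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal animid.&lt;br /&gt;
-- The where clause limits the update to rows that actually change.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;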
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The MS Access database likely allows this condition because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force but adequate solution: it does not validate anything concerning the rows affected.&lt;br /&gt;
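&lt;br /&gt;
A minimal sketch of the change, assuming it is applied in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, where the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values are listed:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace the 0 animal ID numbers with NULL.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;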
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
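&lt;br /&gt;
The second half of this could be sketched as follows.  The sketch assumes the &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value has already been added to ARRIVAL_SOURCES; that step is omitted because the columns of that table are not listed in this entry.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: use &amp;#039;none&amp;#039; in place of NULL data sources.&lt;br /&gt;
-- Assumes &amp;#039;none&amp;#039; already exists as a valid ARRIVAL_SOURCES value.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  where fa_data_source is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;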
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=527</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=527"/>
		<updated>2026-02-16T16:24:38Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#34) There are duplicate animid, date combinations on FOLLOW */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
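&lt;br /&gt;
As a hedged illustration only (this page does not record exactly what the conversion does), one way to keep just the time of day, assuming the date portion is simply discarded, is a cast to &amp;lt;code&amp;gt;TIME&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch, not the actual conversion code: drop the date portion of BREC_time.&lt;br /&gt;
-- The brec_time_only alias is hypothetical.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;::TIME as brec_time_only&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;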
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
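&lt;br /&gt;
A minimal sketch of this substitution, assuming &amp;lt;code&amp;gt;made_by&amp;lt;/code&amp;gt; is a textual person code and that &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; already exists as a person; the statement actually used in the commit may differ:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: report the UNK person wherever MadeBy is NULL.&lt;br /&gt;
select date_of_update&lt;br /&gt;
     , chimp_id&lt;br /&gt;
     , coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;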
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error. Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error. Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem. Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These involve &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
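&lt;br /&gt;
A minimal sketch of how this cleanup and the trailing-space cleanup of problem #31 might be combined in a single expression.  The conversion code itself is not shown on this page, so the shape of the query is an assumption; only the table and column names come from the queries above.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the animid while reading from the easy schema.&lt;br /&gt;
select upper(rtrim(fol_b_animid)) as fol_b_animid&lt;br /&gt;
     , fol_date&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;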
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These involve 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== * (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that were done before their focal was under study, that is, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
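&lt;br /&gt;
The process described above might look roughly like the following sketch.  The &amp;lt;code&amp;gt;AM&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;PM&amp;lt;/code&amp;gt; period codes and the output column aliases are assumptions made for illustration; the actual conversion code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: one row per follow and time period, skipping periods&lt;br /&gt;
-- where both observers are NULL or empty after trimming.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , &amp;#039;AM&amp;#039; as period                  -- assumed period code&lt;br /&gt;
     , btrim(fol_am_observer_1) as obs_brec&lt;br /&gt;
     , btrim(fol_am_observer_2) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
union all&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , &amp;#039;PM&amp;#039; as period                  -- assumed period code&lt;br /&gt;
     , btrim(fol_pm_observer_1) as obs_brec&lt;br /&gt;
     , btrim(fol_pm_observer_2) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;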
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person both observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
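&lt;br /&gt;
One way the substitution might be expressed, as a sketch.  Only the AM columns are shown, and the assumption is that the &amp;quot;NONE&amp;quot; person is identified by the literal code &amp;lt;code&amp;gt;NONE&amp;lt;/code&amp;gt;; the output aliases are illustrative.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: fall back to the NONE person when an observer is missing.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;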
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  -- List every distinct observer value containing a &amp;quot;/&amp;quot; character.&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
    WHERE STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolving this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
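&lt;br /&gt;
The second change could be sketched as an update like the one below.  This assumes the change is applied in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and that the code is stored as the text value &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;; neither is confirmed here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: non-females with a cycle code of 0 get n/a instead.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;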
&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
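&lt;br /&gt;
A sketch of what the change in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema might look like.  The statement is an illustration built from the query above, not the actual conversion code.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: females with a cycle code of n/a get MISS instead.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;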
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;, or&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
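&lt;br /&gt;
The two case corrections listed in this solution could be sketched as a single update.  This assumes the correction is applied to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, which the note under Bad data suggests but does not spell out.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: fix the mis-cased data source codes.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = case fa_data_source&lt;br /&gt;
                         when &amp;#039;BREC&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         when &amp;#039;brec&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         when &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
                       end&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;, &amp;#039;TikI&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;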
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
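&lt;br /&gt;
The Problem section also reports a count for the check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;, but no query for that check is recorded here.  A minimal sketch, assuming it follows the same pattern as the start and end time query above:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: checking against start time alone&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;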
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, 10/2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
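&lt;br /&gt;
A minimal sketch of this cleanup, assuming the affected table is &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and that nothing else depends on the untrimmed values:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: trim trailing spaces from the focal animid&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;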
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The MS Access database seems to allow this condition, likely because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This solution is adequate but brute-force, because it does not validate anything concerning the affected rows.&lt;br /&gt;
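&lt;br /&gt;
A minimal sketch of this change, assuming it is applied directly to &amp;lt;code&amp;gt;clean.biography&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace the 0 placeholder with NULL&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;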
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (It is not in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focal pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
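&lt;br /&gt;
A minimal sketch of this cleanup for the arriving animid, assuming the affected table is &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: trim trailing spaces from the arriving animid&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_b_arr_animid = rtrim(fa_b_arr_animid)&lt;br /&gt;
  where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;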
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
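&lt;br /&gt;
A minimal sketch of the second half of this solution, assuming the new code is the literal string &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; and that its ARRIVAL_SOURCES row already exists.  The columns of ARRIVAL_SOURCES are not shown on this page, so creating that row is left out:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: &amp;#039;none&amp;#039; is an assumed code value&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  where fa_data_source is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;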
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=526</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=526"/>
		<updated>2026-02-16T16:24:05Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#33) Some follows have an animid that does not exist, even after animid cleanup */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
        SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
        TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS NOT NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 by ICG.&lt;br /&gt;
Changed the end date of the KK_P1 membership to 8/14/2022.&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
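&lt;br /&gt;
A minimal sketch of the substitution, assuming the unknown person&amp;#039;s code is the literal &amp;#039;UNK&amp;#039; and that COALESCE is an acceptable way to apply it (the actual conversion code may differ):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Show each affected row with the value the conversion would store.&lt;br /&gt;
select log.*, coalesce(log.made_by, &amp;#039;UNK&amp;#039;) as converted_made_by&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where log.made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;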
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== * (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
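&lt;br /&gt;
The rule above can be sketched as a query that reports, per follow, whether an AM or PM row would be created, alongside the follow&amp;#039;s recorded times.  The am_row and pm_row column names are illustrative only, and this is not the conversion code itself:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- A period gets a FOLLOW_OBSERVERS row when at least one&lt;br /&gt;
-- of its two observer columns is non-blank after trimming.&lt;br /&gt;
select fol_date, fol_b_animid, fol_time_begin, fol_time_end&lt;br /&gt;
     , (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;) as am_row&lt;br /&gt;
     , (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;) as pm_row&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Comparing am_row and pm_row against fol_time_begin and fol_time_end by eye is one way to spot follows whose observer periods look inconsistent with their times.&lt;br /&gt;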
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the &amp;quot;NONE&amp;quot; (no observer) person for both observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
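&lt;br /&gt;
No bad-data query is recorded for this problem.  One possible sketch, assuming the same blank test used in problem #37, lists the follows where exactly one observer of a period is blank (the am_1, am_2, pm_1, and pm_2 names are illustrative):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
with obs as (&lt;br /&gt;
  select fol_date, fol_b_animid&lt;br /&gt;
       , coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) as am_1&lt;br /&gt;
       , coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) as am_2&lt;br /&gt;
       , coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) as pm_1&lt;br /&gt;
       , coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) as pm_2&lt;br /&gt;
    from clean.follow)&lt;br /&gt;
-- Comparing the two booleans with &amp;lt;&amp;gt; keeps rows where one is blank and the other is not.&lt;br /&gt;
select *&lt;br /&gt;
  from obs&lt;br /&gt;
  where (am_1 = &amp;#039;&amp;#039;) &amp;lt;&amp;gt; (am_2 = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (pm_1 = &amp;#039;&amp;#039;) &amp;lt;&amp;gt; (pm_2 = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;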
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
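&lt;br /&gt;
If only the names containing a &amp;quot;/&amp;quot; are wanted, whether or not they also have case variants, a shorter query may be enough.  This is a sketch, not the query used to produce the counts above; the obs and observer names are illustrative:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Unpivot the four observer columns, then keep names with a slash.&lt;br /&gt;
select distinct btrim(observer) as person&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       cross join lateral&lt;br /&gt;
         (values (fol_am_observer_1), (fol_am_observer_2)&lt;br /&gt;
               , (fol_pm_observer_1), (fol_pm_observer_2)) as obs (observer)&lt;br /&gt;
  where strpos(btrim(observer), &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
  order by person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that a value such as n/a also matches, as the problem description points out.&lt;br /&gt;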
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates complicates the conversion process, so resolving this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed this in MS Access, 10/2025.&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
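&lt;br /&gt;
A minimal sketch of the data change, assuming it is applied directly to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (where the conversion actually makes this change is not recorded here) and that the &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; code has already been created:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace missing cycle information with the MISS code.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;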
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
This will be re-reviewed should a temporal extension turn out to be out of budget or otherwise impractical.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
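&lt;br /&gt;
The recoding of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt; could be sketched as follows.  This assumes the change is made in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and reuses the join and sex test from the queries above; it is an illustration, not the conversion&amp;#039;s actual code.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: non-female arriving individuals with a cycle code of 0&lt;br /&gt;
-- get the n/a code instead.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;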
&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
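&lt;br /&gt;
The change to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; described above could be sketched like this, reusing the join from the query in the bad data section.  It is a hedged illustration of the idea, not necessarily how the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema is actually built.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: female arriving individuals with a cycle code of n/a&lt;br /&gt;
-- get the MISS code instead.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;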
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
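&lt;br /&gt;
The case normalizations listed in this solution could be sketched as follows.  This assumes they are applied to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema; the statements are an illustration, not the conversion&amp;#039;s actual code.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the character case of the data source codes.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;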
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for a row to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
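&lt;br /&gt;
The problem statement also describes a check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;, but no query is listed for it.  A sketch of that check, following the pattern of the query above and using only the columns already named:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;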
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
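&lt;br /&gt;
A minimal sketch of the cleanup, assuming it is done with a single update of the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema&amp;#039;s table (the conversion may instead trim while copying the data):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal ids that have them.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;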
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between column names and primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, in that it does not validate anything concerning the affected rows, but it is adequate.&lt;br /&gt;
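&lt;br /&gt;
A minimal sketch of the change, assuming it is applied to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, the same schema the listing query above reads from:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: turn the 0 animal ID numbers into NULLs.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = null&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;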
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
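&lt;br /&gt;
One way this might be done, assuming the column is a plain text type and the update is run against &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Touch only the rows that actually have trailing spaces.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_b_arr_animid = rtrim(fa_b_arr_animid)&lt;br /&gt;
  WHERE fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&amp;lt;/pre&amp;gt;&lt;br /&gt;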
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
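&lt;br /&gt;
A sketch of the second half of that step, assuming the &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; row has already been added to ARRIVAL_SOURCES (its column names are not shown here, so that insert is omitted) and that the code is stored as the literal text &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Substitute the placeholder source for missing values.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  WHERE fa_data_source IS NULL;&amp;lt;/pre&amp;gt;&lt;br /&gt;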
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=525</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=525"/>
		<updated>2026-02-16T16:23:05Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#30) Some follows have no community */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
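&lt;br /&gt;
To illustrate the substitution only (the actual fix lives in the conversion program, which is not shown here), a query-time sketch might look like the following. It assumes &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; is the code given to the unknown person:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Report the unknown person wherever made_by is missing.&lt;br /&gt;
SELECT date_of_update&lt;br /&gt;
     , chimp_id&lt;br /&gt;
     , COALESCE(made_by, &amp;#039;UNK&amp;#039;) AS made_by&lt;br /&gt;
  FROM clean.community_membership_update_log;&amp;lt;/pre&amp;gt;&lt;br /&gt;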
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error; ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
Applying the solution to both problems #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the single source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
updates 297 follow_arrival rows.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Solved&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
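&lt;br /&gt;
A minimal sketch of how this cleanup, together with the trailing-space removal of problem #31, might be written.  It assumes the cleanup is applied in place to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema; the actual conversion code is not shown on this page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize animids by removing trailing spaces&lt;br /&gt;
-- and forcing upper-case.  Only rows that would change are touched.&lt;br /&gt;
update clean.follow&lt;br /&gt;
  set fol_b_animid = upper(rtrim(fol_b_animid))&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;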
&lt;br /&gt;
== * (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the &amp;quot;NONE&amp;quot; (no observer) person for both observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
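&lt;br /&gt;
No query is given above for this problem.  The following is a sketch, modeled on the query for problem #37 and assuming the same space-trimming rules, that might list the follows where a time period has exactly one of its two observers recorded:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: follows where the AM or the PM period has exactly&lt;br /&gt;
-- one of its two observers recorded (after space trimming).&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , fol_am_observer_1, fol_am_observer_2&lt;br /&gt;
     , fol_pm_observer_1, fol_pm_observer_2&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;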
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for this and excluding duplicates in the conversion process is complicated, so resolving it is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
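&lt;br /&gt;
A minimal sketch of the substitution.  It assumes the change is made in place in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and that fa_type_of_cycle holds text codes; the conversion may instead apply it at a later step.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: give arrivals with no cycle information the MISS code.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;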
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
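&lt;br /&gt;
A minimal sketch of the second step.  It assumes the change is applied in place in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and that the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; code is stored as text, like the other cycle codes queried above.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: non-female arrivals with a cycle code of 0 get n/a.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;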
&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problem rows will have to be corrected in the data.  Alternatively, we can adjust the hard limit and set a soft limit of 14 years in the warning system, with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
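&lt;br /&gt;
A minimal sketch of this change, assuming it is applied as a single &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; that reuses the join and conditions of the query above (the statement actually used in the conversion is not recorded on this page):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumed sketch, not the recorded conversion step.&lt;br /&gt;
-- Recode n/a as MISS for female arriving individuals.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;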
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
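&lt;br /&gt;
The two case corrections above could be sketched as follows, assuming they are made with plain &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; statements in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (the target schema is an assumption based on the note under Bad data):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumed sketch: normalize the character case of the source codes.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;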
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for two rows to be duplicates?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed FN community start date in MS Access - Oct 2025&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
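&lt;br /&gt;
A minimal sketch of the trailing-space removal, assuming a direct &amp;lt;code&amp;gt;UPDATE&amp;lt;/code&amp;gt; of &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and assuming that no key or constraint on the column is violated by the trimmed values:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumed sketch: trim only the rows that actually have trailing spaces.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;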
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
	       , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The MS Access database likely allows this condition because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything concerning the rows affected, but it is adequate.&lt;br /&gt;
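&lt;br /&gt;
As a minimal sketch, assuming the change is made directly in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema that the query above reads from:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumed sketch: replace every 0 animal ID number with NULL.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;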
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (It is not in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
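&lt;br /&gt;
The second half of this solution could be sketched as below.  It assumes the &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; row has already been added to ARRIVAL_SOURCES; that insert is not shown because the columns of ARRIVAL_SOURCES are not listed on this page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumed sketch: requires an existing &amp;#039;none&amp;#039; row in ARRIVAL_SOURCES.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  where fa_data_source is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;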
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=524</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=524"/>
		<updated>2026-02-16T16:22:11Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#27) Mismatch of end-in-nest on follow and follow_arrival */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
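&lt;br /&gt;
A minimal sketch of one way such a value could be reduced to a bare time, assuming (this is not confirmed above) that only the time-of-day portion of &amp;lt;code&amp;gt;BREC_time&amp;lt;/code&amp;gt; is meaningful and that the column is a timestamp, as the &amp;lt;code&amp;gt;::DATE&amp;lt;/code&amp;gt; cast in the query above suggests.  The output column names are hypothetical:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only; the actual conversion code may differ.&lt;br /&gt;
-- Casting to TIME drops the date portion, whatever it is.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot; as original_value&lt;br /&gt;
     , &amp;quot;BREC_time&amp;quot;::TIME as time_only&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;&lt;br /&gt;
  where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;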
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community that is not their birth community before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024, ICG.&lt;br /&gt;
Changed the end date of the KK_P1 membership to 8/14/2022.&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
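&lt;br /&gt;
A minimal sketch of the substitution, assuming the unknown person&amp;#039;s code is the literal &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; and that &amp;lt;code&amp;gt;made_by&amp;lt;/code&amp;gt; is a text column.  The output column names are hypothetical and the actual conversion code may differ:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: count log rows per person, with NULL&lt;br /&gt;
-- MadeBy values reported under the unknown person.&lt;br /&gt;
select coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by_or_unk&lt;br /&gt;
     , count(*) as row_count&lt;br /&gt;
  from clean.community_membership_update_log&lt;br /&gt;
  group by coalesce(made_by, &amp;#039;UNK&amp;#039;)&lt;br /&gt;
  order by made_by_or_unk;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;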
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
==(#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
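&lt;br /&gt;
Since the times on FOLLOW are ignored for problems #24 and #25, a follow&amp;#039;s begin and end can instead be derived from the focal&amp;#039;s own FOLLOW_ARRIVAL rows.  A minimal sketch, reusing the focal-arrival condition from the queries above; the output column names &amp;lt;code&amp;gt;derived_begin&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;derived_end&amp;lt;/code&amp;gt; are hypothetical, and the conversion code itself may do this differently:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: one row per follow, with begin and end&lt;br /&gt;
-- times taken from the focal&amp;#039;s arrival rows.&lt;br /&gt;
SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
     , MIN(fa_time_start) AS derived_begin&lt;br /&gt;
     , MAX(fa_time_end) AS derived_end&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
  GROUP BY fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
  ORDER BY fa_fol_date, fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;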
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
==(#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
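&lt;br /&gt;
Together with the trailing-space removal of problem #31, the cleanup can be sketched as a single expression.  This is an illustration, not the conversion code itself; the output column names &amp;lt;code&amp;gt;original_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;cleaned_animid&amp;lt;/code&amp;gt; are hypothetical:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: show each animid that the cleanup would change,&lt;br /&gt;
-- next to its cleaned form (trailing spaces removed, upper-cased).&lt;br /&gt;
select fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;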
&lt;br /&gt;
== * (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows dated before their focal&amp;#039;s EntryDate, that is, before the focal was under study.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person both observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
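&lt;br /&gt;
A minimal sketch of the substitution, assuming it is done with a query against &amp;lt;code&amp;gt;clean.follow&amp;lt;/code&amp;gt;.  The output column names are hypothetical and the actual conversion code may differ.  A NULL, empty, or all-space observer becomes &amp;quot;NONE&amp;quot;.  The sketch shows only the substitution, not the rule from problem #36 that skips a time period when both of its observers are missing.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: the am_*/pm_* output names are made up for illustration.&lt;br /&gt;
select follow.fol_date&lt;br /&gt;
     , follow.fol_b_animid&lt;br /&gt;
     , coalesce(nullif(btrim(follow.fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;)&lt;br /&gt;
         as am_brec&lt;br /&gt;
     , coalesce(nullif(btrim(follow.fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;)&lt;br /&gt;
         as am_tiki&lt;br /&gt;
     , coalesce(nullif(btrim(follow.fol_pm_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;)&lt;br /&gt;
         as pm_brec&lt;br /&gt;
     , coalesce(nullif(btrim(follow.fol_pm_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;)&lt;br /&gt;
         as pm_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;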
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  -- Every distinct observer value containing a &amp;quot;/&amp;quot;, including n/a.&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
    WHERE STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these variants and excluding the duplicates during conversion is complicated, so resolving this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
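&lt;br /&gt;
A minimal sketch of how a preferred spelling might be found, for one observer column only.  It assumes &amp;quot;mixed case&amp;quot; means a value that is neither all upper case nor all lower case; that definition, and the output column names, are assumptions and not taken from the conversion code.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: lists the mixed case spelling(s) for each case-folded name.&lt;br /&gt;
with people as (&lt;br /&gt;
  select distinct btrim(follow.fol_am_observer_1) as person&lt;br /&gt;
    from easy.follow&lt;br /&gt;
    where btrim(follow.fol_am_observer_1) &amp;lt;&amp;gt; &amp;#039;&amp;#039;)&lt;br /&gt;
select lower(people.person) as folded&lt;br /&gt;
     , people.person as preferred&lt;br /&gt;
  from people&lt;br /&gt;
  where people.person &amp;lt;&amp;gt; upper(people.person)&lt;br /&gt;
        and people.person &amp;lt;&amp;gt; lower(people.person)&lt;br /&gt;
  order by folded, preferred;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A folded name that appears more than once in the result has several mixed case spellings and would need a manual choice.&lt;br /&gt;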
&lt;br /&gt;
&lt;br /&gt;
== * (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres, making the db a temporal database that tracks change history and can show the data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
This will be re-reviewed should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
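&lt;br /&gt;
A minimal sketch of the second change, assuming it is applied directly to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema with a single statement; the actual conversion may do this elsewhere.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: recode a cycle code of 0 to n/a for non-female arrivals.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;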
&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
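&lt;br /&gt;
A minimal sketch of the two spelling changes, written as a preview query against the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema rather than an update.  The &amp;lt;code&amp;gt;normalized&amp;lt;/code&amp;gt; column name is hypothetical, and the actual cleanup code may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show each source code next to its normalized spelling.&lt;br /&gt;
select follow_arrival.fa_data_source&lt;br /&gt;
     , case&lt;br /&gt;
         when follow_arrival.fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;)&lt;br /&gt;
           then &amp;#039;Brec&amp;#039;&lt;br /&gt;
         when follow_arrival.fa_data_source = &amp;#039;TikI&amp;#039;&lt;br /&gt;
           then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
         else follow_arrival.fa_data_source&lt;br /&gt;
       end as normalized&lt;br /&gt;
     , count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  group by 1, 2&lt;br /&gt;
  order by 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;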
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
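&lt;br /&gt;
The check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; is described in the problem statement but has no query listed.  A minimal sketch, assuming it has the same shape as the query above with the end time removed:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: checking against start time alone&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;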
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, October 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
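&lt;br /&gt;
A minimal sketch of that cleanup, assuming the &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt; column is a variable-length text type, that no constraint blocks the change, and that the change is made directly in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (the conversion itself may do this differently):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal id.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;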
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The MS Access database seems to allow this condition, likely because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, because it does not validate anything about the affected rows, but it is adequate.&lt;br /&gt;
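&lt;br /&gt;
A minimal sketch of the change, assuming it is applied directly to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema rather than inside the conversion program:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: replace every 0 animal ID number with NULL.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;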
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person (it is not in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table).&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focal pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
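&lt;br /&gt;
The second half of that change could be sketched as follows. This is an illustration only: it assumes the &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; row has already been added to the lookup table, and that the code is spelled exactly &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Sketch only: the literal &amp;#039;none&amp;#039; is an assumed spelling of the new code.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  WHERE fa_data_source IS NULL;&amp;lt;/pre&amp;gt;&lt;br /&gt;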
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=523</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=523"/>
		<updated>2026-02-16T16:21:39Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#26) Mismatch of start-in-nest on follow and follow_arrival */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When the query is run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
        SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
        TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 7a48fb2c4aad3b5b.&lt;br /&gt;
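&lt;br /&gt;
A minimal sketch of the substitution, assuming &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; is the code given to the unknown person.  The column alias is hypothetical, and the actual conversion code in the commit may differ:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: show each NULL made_by replaced with the UNK person.&lt;br /&gt;
select log.*, coalesce(log.made_by, &amp;#039;UNK&amp;#039;) as converted_made_by&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where log.made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;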
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore.  Default to the data from Follow_Arrival.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== * (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
If either table says the focal was in a nest, then have the focal be in a nest.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
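&lt;br /&gt;
A minimal sketch of the combined cleanup for problems #31 and #32, using the same &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; expression that appears in the problem #33 queries.  The column aliases are hypothetical, and the actual conversion code may differ:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: each animid changed by the cleanup, before and after.&lt;br /&gt;
select fol_b_animid as old_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;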
&lt;br /&gt;
== * (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
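&lt;br /&gt;
A minimal sketch of the substitution for the AM period, assuming the literal code &amp;lt;code&amp;gt;NONE&amp;lt;/code&amp;gt; identifies the no-observer person.  The output column names are hypothetical, and the PM columns would be handled the same way:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: follows where exactly one AM observer is blank,&lt;br /&gt;
-- with the blank observer replaced by NONE.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as am_obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as am_obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;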
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates is complicated in the conversion process, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
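&lt;br /&gt;
A minimal sketch of the substitution, assuming &amp;lt;code&amp;gt;fa_type_of_cycle&amp;lt;/code&amp;gt; is a text column and that &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; is stored as a literal code.  The column alias is hypothetical:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: arrivals with no cycle information, shown with MISS.&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as converted_cycle&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;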
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females that have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
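&lt;br /&gt;
A minimal sketch of the second change. It assumes the fix is applied to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, which this entry does not state, and it has not been run against the data:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: the change is made in the clean schema.&lt;br /&gt;
-- Non-females with a cycle code of 0 get a cycle code of n/a.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;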
&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
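&lt;br /&gt;
The change to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; could be written along these lines. This is a sketch built from the selection criteria of the query above, not necessarily the statement that was actually run:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Females with a cycle code of n/a get a cycle code of MISS.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;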
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
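&lt;br /&gt;
The two case corrections above might look like the following. This is a sketch that assumes the corrections are applied to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, as the Bad data note suggests:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Normalize the capitalization of Brec.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
-- Normalize the capitalization of Tiki.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;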
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
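&lt;br /&gt;
The Problem section also describes a check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;, but no query is listed for it. A sketch of that check, assuming it mirrors the query above with the end time removed:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (sketch)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;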
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
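&lt;br /&gt;
One way to remove the spaces, shown as a sketch that reuses the comparison from the query above and assumes the table is &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Strip trailing spaces from the focal animid, touching only affected rows.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;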
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The MS Access database likely allows this condition because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is an adequate solution, but a brute-force one, because it does not validate anything concerning the rows affected.&lt;br /&gt;
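&lt;br /&gt;
As a sketch, assuming the change is made in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema that the query above reads from:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Replace every 0 animal ID number with NULL; no other validation is done.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;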
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (It is not in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
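&lt;br /&gt;
The replacement step might be sketched as follows. Adding the &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; row to ARRIVAL_SOURCES is not shown, because the columns of that table are not given on this page:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumes a none value already exists in ARRIVAL_SOURCES.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  where fa_data_source is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;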
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=522</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=522"/>
		<updated>2026-02-16T16:17:17Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#25) Follow ends are not last arrivals */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
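&lt;br /&gt;
A minimal sketch of one way such values could be normalized, assuming (this is not confirmed above) that the fix keeps only the time-of-day portion of the column and discards the date portion:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: only the time of day in BREC_time is meaningful.&lt;br /&gt;
-- The brec_time_only alias is hypothetical.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;::TIME as brec_time_only&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;&lt;br /&gt;
  where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;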
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
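&lt;br /&gt;
As a hedged illustration only (the conversion code in the commit above may differ), the substitution can be sketched with &amp;lt;code&amp;gt;COALESCE&amp;lt;/code&amp;gt;, assuming the unknown person&amp;#039;s code is the string &amp;#039;UNK&amp;#039;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: &amp;#039;UNK&amp;#039; is the code of the unknown person.&lt;br /&gt;
select date_of_update&lt;br /&gt;
     , chimp_id&lt;br /&gt;
     , coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;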
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error. Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last arrival/last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error. Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== * (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
If either table says the focal was in a nest, then have the focal be in a nest.&lt;br /&gt;
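&lt;br /&gt;
A sketch of this rule for the start of a follow, modeled on the query above. It assumes, as that query implies, that &amp;lt;code&amp;gt;fa_type_of_nesting&amp;lt;/code&amp;gt; values 1 and 3 mean the arrival began in a nest and that &amp;lt;code&amp;gt;fol_flag_begin_in_nest&amp;lt;/code&amp;gt; is 0 or 1. The &amp;lt;code&amp;gt;begin_in_nest&amp;lt;/code&amp;gt; output name is hypothetical, and the extra join condition on the focal&amp;#039;s own arrival row is an addition not present in the query above:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT follow.fol_date, follow.fol_b_animid&lt;br /&gt;
       -- In a nest if either table says so; a NULL counts as &amp;quot;does not say so&amp;quot;.&lt;br /&gt;
     , (COALESCE(first_arrivals.fa_type_of_nesting IN (1, 3), FALSE)&lt;br /&gt;
        OR COALESCE(follow.fol_flag_begin_in_nest = 1, FALSE)) AS begin_in_nest&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          -- Assumption: use only the focal&amp;#039;s own arrival row.&lt;br /&gt;
          AND first_arrivals.fa_b_arr_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  ORDER BY follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;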
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problems #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== * (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
If either table says the focal was in a nest, then have the focal be in a nest.&lt;br /&gt;
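&lt;br /&gt;
A minimal sketch of this rule, not the actual conversion code: it reuses the Bad Data query and reports the end-in-nest value each follow would receive.  It assumes, as that query does, that a &amp;lt;code&amp;gt;fa_type_of_nesting&amp;lt;/code&amp;gt; of 2 or 3 means the focal ended in a nest.  Restricting &amp;lt;code&amp;gt;last_arrivals&amp;lt;/code&amp;gt; to the focal&amp;#039;s own arrival row, and the &amp;lt;code&amp;gt;resolved_end_in_nest&amp;lt;/code&amp;gt; name, are assumptions made for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT follow.fol_date, follow.fol_b_animid&lt;br /&gt;
       -- In a nest if either table says so&lt;br /&gt;
     , (COALESCE(follow.fol_flag_end_in_nest, 0) = 1&lt;br /&gt;
        OR COALESCE(last_arrivals.fa_type_of_nesting, 0) IN (2, 3))&lt;br /&gt;
       AS resolved_end_in_nest&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          -- Assumption: only the focal&amp;#039;s own arrival row counts&lt;br /&gt;
          AND last_arrivals.fa_b_arr_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  ORDER BY follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;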
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#27) Mismatch of end-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
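&lt;br /&gt;
As a sketch, the cleanup for this problem and problem #31 can be previewed together, assuming it amounts to applying &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; to the animid (the same expression the problem #33 query uses).  The column aliases are illustrative only.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Rows whose animid would change, with the value before and after&lt;br /&gt;
select fol_date&lt;br /&gt;
     , fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;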
&lt;br /&gt;
== * (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that were done before their focal was under study, that is, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person both of the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
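&lt;br /&gt;
The affected follows can be listed with a query along these lines.  This is a sketch modeled on the problem #37 query; it assumes an observer column counts as empty when it is NULL or blank after trimming, and returns follows where, in the AM or the PM period, exactly one of the two observer columns is filled in.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  -- Comparing the two &amp;quot;is empty&amp;quot; tests with &amp;lt;&amp;gt; is true when exactly one is empty&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;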
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates complicates the conversion process, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres, making the database a temporal database that tracks change history and can show the data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
This will be re-reviewed should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
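&lt;br /&gt;
A sketch of the second change, assuming it is applied as an update in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and that the code is stored as the text &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;.  Neither assumption is confirmed here, and the actual conversion step may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Non-female arriving individuals: cycle code 0 becomes n/a&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;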
&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
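&lt;br /&gt;
To illustrate the endpoint arithmetic, the sketch below applies the same subtraction the Bad data query uses to a made-up follow date.  For a follow on 2000-06-15 the cutoff works out to 1985-06-15, so only females born on or before that date, who have reached their 15th birthday, are flagged.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical follow date, for illustration only&lt;br /&gt;
select date &amp;#039;2000-06-15&amp;#039;&lt;br /&gt;
         - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
         - &amp;#039;14 years&amp;#039;::INTERVAL as latest_flagged_birthdate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;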
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
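&lt;br /&gt;
A sketch of how this change could be expressed against the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, reusing the join from the Bad data query.  Writing it as a single update is an assumption; the actual conversion step may be implemented differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Female arriving individuals: cycle code n/a becomes MISS&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;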
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
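&lt;br /&gt;
The case fixes above can be previewed with a query like the following.  It is a sketch against the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema; the &amp;lt;code&amp;gt;cleaned_source&amp;lt;/code&amp;gt; alias is illustrative, and all other codes are passed through unchanged.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Each existing code, the code it would become, and its row count&lt;br /&gt;
select fa_data_source&lt;br /&gt;
     , case&lt;br /&gt;
         when fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;) then &amp;#039;Brec&amp;#039;&lt;br /&gt;
         when fa_data_source = &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
         else fa_data_source&lt;br /&gt;
       end as cleaned_source&lt;br /&gt;
     , count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  group by fa_data_source&lt;br /&gt;
  order by fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;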
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for two rows to be duplicates?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- No duplicates when looking at just the sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
     , fa_fol_b_focal_animid&lt;br /&gt;
     , fa_b_arr_animid&lt;br /&gt;
     , fa_seq_num&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  group by fa_fol_date&lt;br /&gt;
         , fa_fol_b_focal_animid&lt;br /&gt;
         , fa_b_arr_animid&lt;br /&gt;
         , fa_seq_num&lt;br /&gt;
  having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
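&lt;br /&gt;
The looser check described in the Problem section, which leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;, might be sketched as follows. This assumes the same &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; columns used in the query above; it has not been checked against the 430-row count.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (a sketch, not verified against the data)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;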
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
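&lt;br /&gt;
A minimal sketch of such a cleanup is shown below. It assumes the column holds a text type and that the table lives at &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the query above does not name a schema). It is not necessarily the statement the conversion actually runs.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal animid.&lt;br /&gt;
-- The schema name (clean) is an assumption.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Re-running the query under &amp;quot;Bad data&amp;quot; afterwards should then return no rows.&lt;br /&gt;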
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The MS Access database seems to allow this condition, likely because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is an adequate solution, but a brute-force one, because it does not validate anything concerning the rows affected.&lt;br /&gt;
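&lt;br /&gt;
As a sketch, assuming the change is made directly in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema rather than in the MS Access source, the replacement could look like:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace 0 animal ID numbers with NULL.&lt;br /&gt;
-- No other column of the affected rows is examined.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;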
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
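&lt;br /&gt;
The second half of that change might be sketched as below. This assumes the &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; row has already been added to ARRIVAL_SOURCES; that step is omitted here because the lookup table&amp;#039;s column names are not given on this page.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: substitute the none code for NULL data sources.&lt;br /&gt;
-- Assumes a none row already exists in ARRIVAL_SOURCES.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  where fa_data_source is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;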
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=521</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=521"/>
		<updated>2026-02-16T16:16:50Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#24) Follow starts are not first arrivals */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted the row.&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted the time stamp.&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted the time stamp.&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS NOT NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community other than their birth community before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
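&lt;br /&gt;
A minimal sketch of the substitution, assuming the unknown person&amp;#039;s code is the literal UNK and that the conversion reads from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema; the actual conversion code may differ:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show the MadeBy value that would be stored.&lt;br /&gt;
-- Assumes &amp;#039;UNK&amp;#039; is the code of the unknown person.&lt;br /&gt;
select date_of_update&lt;br /&gt;
     , chimp_id&lt;br /&gt;
     , coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log&lt;br /&gt;
  order by date_of_update, chimp_id;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;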
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== * (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== * (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
If either table says the focal was in a nest, then have the focal be in a nest.&lt;br /&gt;
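&lt;br /&gt;
As a rough sketch of this rule, assuming (as the query above does) that fa_type_of_nesting values 1 and 3 mean the arrival began in a nest and that fol_flag_begin_in_nest is 0 or 1, the combined flag could be computed as follows.  The begin_in_nest output column is a name made up for illustration, and treating NULL as &amp;quot;not in a nest&amp;quot; is also an assumption:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT follow.fol_date, follow.fol_b_animid&lt;br /&gt;
       -- In a nest if either table says so (hypothetical output column).&lt;br /&gt;
     , (COALESCE(follow.fol_flag_begin_in_nest, 0) = 1&lt;br /&gt;
        OR COALESCE(first_arrivals.fa_type_of_nesting, 0) IN (1, 3))&lt;br /&gt;
       AS begin_in_nest&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  ORDER BY follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The end-in-nest case of problem #27 would be analogous, using MAX(fa_time_end), fol_flag_end_in_nest, and the nesting values 2 and 3.&lt;br /&gt;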
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== * (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
If either table says the focal was in a nest, then have the focal be in a nest.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#* (#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
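&lt;br /&gt;
A sketch of the cleanup from problems #31 and #32 applied together, run against the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema where the uncleaned values are still visible.  The column aliases are made up for illustration, and the conversion process itself may do this differently:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Show each animid that the cleanup would change, next to its cleaned form.&lt;br /&gt;
select fol_date&lt;br /&gt;
     , fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;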
&lt;br /&gt;
== * (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column going into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column going in OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
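&lt;br /&gt;
The mapping described above can be sketched roughly as follows.  The AM and PM period codes and the output column aliases are assumptions made for illustration, not necessarily the values the conversion uses:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- One row per follow and time period that has at least one observer.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , &amp;#039;AM&amp;#039; as period                    -- assumed period code&lt;br /&gt;
     , btrim(fol_am_observer_1) as obs_brec&lt;br /&gt;
     , btrim(fol_am_observer_2) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
union all&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , &amp;#039;PM&amp;#039; as period                    -- assumed period code&lt;br /&gt;
     , btrim(fol_pm_observer_1) as obs_brec&lt;br /&gt;
     , btrim(fol_pm_observer_2) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;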
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
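&lt;br /&gt;
A sketch of the substitution for the morning observers, assuming the code of the &amp;quot;NONE&amp;quot; person is the literal NONE; the output column aliases are illustrative only.  The afternoon columns would be handled the same way:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Fill in the missing observer when only one of the two is recorded.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;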
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  -- Observer names containing a &amp;quot;/&amp;quot;, which appear to name 2 people.&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
    WHERE STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in the columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2, there are 511 names that differ from another name only by character case.  That is about half as many unique names once case is ignored.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates during the conversion process is complicated, so resolving this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching row in the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
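&lt;br /&gt;
A minimal sketch of the cycle code change, assuming it is applied directly to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (where the change is actually made is an assumption, not something recorded here):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: assumes the fix is applied to clean.follow_arrival.&lt;br /&gt;
-- Non-female arrivals with a cycle code of &amp;#039;0&amp;#039; get &amp;#039;n/a&amp;#039;.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  FROM clean.biography&lt;br /&gt;
  WHERE biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        AND biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        AND follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;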
&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit and set a soft limit of 5 years in the warning system, with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
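&lt;br /&gt;
To see where the cutoff in the query above falls, the same date arithmetic can be evaluated for a single follow date.  The date used here is made up, purely for illustration:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only; 2000-06-15 is a made-up follow date.&lt;br /&gt;
select &amp;#039;2000-06-15&amp;#039;::DATE&lt;br /&gt;
       - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
       - &amp;#039;14 years&amp;#039;::INTERVAL AS latest_birthdate;&lt;br /&gt;
-- The result is 1985-06-15 00:00:00.  Females born on or before&lt;br /&gt;
-- that date were at least 15 years old on the follow date.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;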
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be fixed in the data.  Alternatively, we can adjust the hard limit and set a soft limit of 14 years in the warning system, with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
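&lt;br /&gt;
The change of cycle code might be made with something like the following.  This is a sketch, assuming the same join to &amp;lt;code&amp;gt;biography&amp;lt;/code&amp;gt; used in the query above; it is not necessarily the statement the conversion actually runs:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: female arrivals with a cycle code of &amp;#039;n/a&amp;#039; get &amp;#039;MISS&amp;#039;.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  FROM clean.biography&lt;br /&gt;
  WHERE biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        AND biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        AND follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;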
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source  count&lt;br /&gt;
  Tiki_GM         202&lt;br /&gt;
  Tiki_PM         165&lt;br /&gt;
  Tiki_SS         22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
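&lt;br /&gt;
The two case corrections could be applied as follows.  This is a sketch, assuming the corrections are made to the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the character case of the data source codes.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  WHERE fa_data_source IN (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  WHERE fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;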
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for a row to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
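&lt;br /&gt;
The check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; is described in the problem statement but not shown.  A sketch, assuming it simply mirrors the query above with the end time dropped:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (sketch)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;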
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both the start and end times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
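&lt;br /&gt;
A minimal sketch of the removal, assuming the table in question is &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal ids that have them.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_fol_b_focal_animid = RTRIM(fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE fa_fol_b_focal_animid &amp;lt;&amp;gt; RTRIM(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;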
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The MS Access database seems to allow this condition, likely because of a difference in character case between the column names and the primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force solution, in that it does not validate anything concerning the rows affected, but it is adequate.&lt;br /&gt;
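&lt;br /&gt;
As a sketch, assuming the change is made in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, the update could be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace the zero animal ID numbers with NULL.&lt;br /&gt;
UPDATE clean.biography&lt;br /&gt;
  SET b_animid_num = NULL&lt;br /&gt;
  WHERE b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;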
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (Not on the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
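&lt;br /&gt;
One way this could be done, sketched under the assumption that the affected table is &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and that it has the same &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; column as the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema table queried above:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Assumption: clean.follow_arrival is the table to be fixed.&lt;br /&gt;
-- The WHERE clause limits the update to rows that actually change.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_b_arr_animid = rtrim(fa_b_arr_animid)&lt;br /&gt;
  WHERE fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;rtrim()&amp;lt;/code&amp;gt; call with a single argument removes trailing spaces only, so other trailing whitespace, if any, would be left in place.&lt;br /&gt;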
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
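&lt;br /&gt;
A rough sketch of the second half of this solution. It assumes the new code is the literal string &amp;#039;none&amp;#039; and that it has already been added to ARRIVAL_SOURCES; that table&amp;#039;s column names are not given on this page, so adding the code is not shown:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Assumption: &amp;#039;none&amp;#039; already exists as a code in ARRIVAL_SOURCES.&lt;br /&gt;
UPDATE clean.follow_arrival&lt;br /&gt;
  SET fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  WHERE fa_data_source IS NULL;&amp;lt;/pre&amp;gt;&lt;br /&gt;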
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=520</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=520"/>
		<updated>2026-02-16T16:07:45Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* Solution */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
When the query is run in SokweDB, the table does not exist.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS NOT NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
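&lt;br /&gt;
As an illustration only, the substitution could be sketched as below. This assumes the change is applied directly to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema and that an UNK row already exists in &amp;lt;code&amp;gt;clean.people&amp;lt;/code&amp;gt;; the actual fix lives in the commit above and may work differently:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;-- Assumption: &amp;#039;UNK&amp;#039; is already a person in clean.people.&lt;br /&gt;
UPDATE clean.community_membership_update_log&lt;br /&gt;
  SET made_by = &amp;#039;UNK&amp;#039;&lt;br /&gt;
  WHERE made_by IS NULL;&amp;lt;/pre&amp;gt;&lt;br /&gt;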
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== * (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== * (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is a data entry error.  Ignore the problem, and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== * (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;1,848&amp;lt;/s&amp;gt; 3,369 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
If either table says the focal was in a nest, then have the focal be in a nest.&lt;br /&gt;
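&lt;br /&gt;
The rule can be previewed with a query like the one below.  This is only a sketch, not the conversion code.  It assumes that the &amp;lt;code&amp;gt;fa_type_of_nesting&amp;lt;/code&amp;gt; codes mean 0 = no nest, 1 = began in nest, 2 = ended in nest, and 3 = both; that reading is inferred from the queries on this page and is not confirmed.  It also restricts the first arrival to the focal&amp;#039;s own row, which the query above does not do.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only.  ASSUMED codes: 0 = no nest, 1 = began in nest,&lt;br /&gt;
-- 2 = ended in nest, 3 = both.&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT follow.fol_date, follow.fol_b_animid&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting AS old_nesting&lt;br /&gt;
       -- Add &amp;quot;began in nest&amp;quot; to what the arrival already says.&lt;br /&gt;
     , CASE first_arrivals.fa_type_of_nesting&lt;br /&gt;
         WHEN 0 THEN 1&lt;br /&gt;
         WHEN 2 THEN 3&lt;br /&gt;
       END AS new_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_b_arr_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  -- Only the follow says the focal began in a nest.&lt;br /&gt;
  WHERE follow.fol_flag_begin_in_nest = 1&lt;br /&gt;
        AND first_arrivals.fa_type_of_nesting IN (0, 2)&lt;br /&gt;
  ORDER BY follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Rows where only follow_arrival says the focal was in a nest need no change, as long as follow_arrival is the table the conversion reads.&lt;br /&gt;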
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== * (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
If either table says the focal was in a nest, then have the focal be in a nest.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#* (#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 39 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These comprise &amp;lt;s&amp;gt;9&amp;lt;/s&amp;gt; 23 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
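&lt;br /&gt;
This cleanup and the trailing-space cleanup of problem #31 can be combined into one expression, as the queries in problem #33 do.  As a minimal sketch, not the actual conversion code, the following lists each animid that the combined cleanup would change, next to its cleaned-up value.  The &amp;lt;code&amp;gt;old_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;new_animid&amp;lt;/code&amp;gt; column aliases are made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: each distinct animid changed by trimming and upper-casing.&lt;br /&gt;
select distinct fol_b_animid as old_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as new_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by old_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;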
&lt;br /&gt;
== * (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These comprise 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
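&lt;br /&gt;
Taken together with the process described in problem #36, the rows for FOLLOW_OBSERVERS could be produced along the following lines.  This is a sketch, not the actual conversion code.  The period codes &amp;lt;code&amp;gt;AM&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;PM&amp;lt;/code&amp;gt; are placeholders, since the real PERIODS codes are not listed on this page.  Follows with no observers at all (problem #37) are not handled here.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only.  The period codes &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; are placeholders.&lt;br /&gt;
SELECT fol_date, fol_b_animid, &amp;#039;AM&amp;#039; AS period&lt;br /&gt;
       -- A missing observer becomes the &amp;quot;NONE&amp;quot; person.&lt;br /&gt;
     , COALESCE(NULLIF(BTRIM(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) AS obs_brec&lt;br /&gt;
     , COALESCE(NULLIF(BTRIM(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) AS obs_tiki&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
  -- No row for the period when both observers are missing.&lt;br /&gt;
  WHERE COALESCE(BTRIM(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        OR COALESCE(BTRIM(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
UNION ALL&lt;br /&gt;
SELECT fol_date, fol_b_animid, &amp;#039;PM&amp;#039; AS period&lt;br /&gt;
     , COALESCE(NULLIF(BTRIM(fol_pm_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) AS obs_brec&lt;br /&gt;
     , COALESCE(NULLIF(BTRIM(fol_pm_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) AS obs_tiki&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
  WHERE COALESCE(BTRIM(fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        OR COALESCE(BTRIM(fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  ORDER BY fol_date, fol_b_animid, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;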
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
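&lt;br /&gt;
The query above only returns a name when another name exists that differs from it only by character case.  As a sketch, the following lists every distinct observer value that contains a &amp;quot;/&amp;quot;, whether or not it has a case variant.  The &amp;lt;code&amp;gt;obs&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;observer&amp;lt;/code&amp;gt; names are made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: all distinct observer values containing a &amp;quot;/&amp;quot;.&lt;br /&gt;
SELECT DISTINCT BTRIM(obs.observer) AS person&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
       -- Turn the 4 observer columns into 4 rows per follow.&lt;br /&gt;
       CROSS JOIN LATERAL&lt;br /&gt;
         (VALUES (follow.fol_am_observer_1)&lt;br /&gt;
               , (follow.fol_am_observer_2)&lt;br /&gt;
               , (follow.fol_pm_observer_1)&lt;br /&gt;
               , (follow.fol_pm_observer_2)) AS obs (observer)&lt;br /&gt;
  -- NULL observers drop out, because STRPOS of NULL is NULL.&lt;br /&gt;
  WHERE STRPOS(obs.observer, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
  ORDER BY person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;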
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for this and excluding the duplicates in the conversion process is complicated, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;217&amp;lt;/s&amp;gt; 49 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Add a temporal extension to Postgres to make the db into a temporal database to track change history and be able to see data as it existed at any point in time.&lt;br /&gt;
&lt;br /&gt;
Will re-review this should a temporal extension be out of budget, etc.&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;169&amp;lt;/s&amp;gt; 168 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
Resolution: Fixed in the data.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are &amp;lt;s&amp;gt;239&amp;lt;/s&amp;gt; 324 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;127&amp;lt;/s&amp;gt; 98 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
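&lt;br /&gt;
The endpoint can be checked with made-up dates.  As a small sketch, the following applies the same comparison as the query in the Bad data section below to a hypothetical follow on 2015-06-15.  The dates and column aliases are invented for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch with made-up dates, for a follow on 2015-06-15.&lt;br /&gt;
select date &amp;#039;2000-06-15&amp;#039;&lt;br /&gt;
         &amp;lt;= (date &amp;#039;2015-06-15&amp;#039;&lt;br /&gt;
             - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
             - &amp;#039;14 years&amp;#039;::INTERVAL) as exactly_15     -- true: flagged&lt;br /&gt;
     , date &amp;#039;2000-06-16&amp;#039;&lt;br /&gt;
         &amp;lt;= (date &amp;#039;2015-06-15&amp;#039;&lt;br /&gt;
             - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
             - &amp;#039;14 years&amp;#039;::INTERVAL) as one_day_short; -- false: not flagged&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;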
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
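&lt;br /&gt;
A minimal sketch of how this change could be expressed, assuming PostgreSQL and that the update is applied directly to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  It reuses the join and conditions of the query above and is not necessarily the statement that was actually run:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: recode n/a to MISS for female arriving individuals.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;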
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
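&lt;br /&gt;
The two spelling changes could be sketched as follows, assuming PostgreSQL and that they are applied to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is an illustration of the normalization, not a record of the statements actually used:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the character case of the data source codes.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;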
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
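&lt;br /&gt;
The Problem section also describes a check on start time alone, for which no query is listed.  A minimal sketch, assuming it differs from the start and end time check only by leaving out &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (sketch; fa_time_end is ignored)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;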
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;688&amp;lt;/s&amp;gt; 84 rows, having &amp;lt;s&amp;gt;3&amp;lt;/s&amp;gt; 6 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are &amp;lt;s&amp;gt;26&amp;lt;/s&amp;gt; 42 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed FN community start date in MS Access - Oct 2025&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
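&lt;br /&gt;
A minimal sketch of the removal, assuming PostgreSQL and an in-place update of the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  The condition mirrors the query above, so that only the affected rows are rewritten:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: strip trailing spaces from the focal animid.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_fol_b_focal_animid = rtrim(fa_fol_b_focal_animid)&lt;br /&gt;
  where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;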
&lt;br /&gt;
== * (#57) GROOM_BOUT duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;GROOM_BOUT&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;GRM_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_FOL_B_focal_AnimId&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_time_begin&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;GRM_B_partner_AnimId&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;GRM_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;GRM_time_begin&amp;quot; AS the_time&lt;br /&gt;
               , &amp;quot;GRM_B_partner_AnimId&amp;quot; AS the_partner&lt;br /&gt;
            FROM raw.&amp;quot;GROOM_BOUT&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;GRM_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_time_begin&amp;quot;&lt;br /&gt;
                   , &amp;quot;GRM_B_partner_AnimId&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS gb&lt;br /&gt;
      ON (&amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_date&amp;quot; = gb.the_date&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_FOL_B_focal_AnimId&amp;quot; = gb.the_animid&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_time_begin&amp;quot; = gb.the_time&lt;br /&gt;
          AND &amp;quot;GROOM_BOUT&amp;quot;.&amp;quot;GRM_B_partner_AnimId&amp;quot; = gb.the_partner);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#58) OTHER_SPECIES duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;OTHER_SPECIES&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;OS_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;OS_time_begin&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;OS_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;OS_time_begin&amp;quot; AS the_time&lt;br /&gt;
            FROM raw.&amp;quot;OTHER_SPECIES&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;OS_FOL_date&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_FOL_B_focal_AnimId&amp;quot;&lt;br /&gt;
                   , &amp;quot;OS_time_begin&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS os&lt;br /&gt;
      ON (&amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_date&amp;quot; = os.the_date&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_FOL_B_focal_AnimId&amp;quot; = os.the_animid&lt;br /&gt;
          AND &amp;quot;OTHER_SPECIES&amp;quot;.&amp;quot;OS_time_begin&amp;quot; = os.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Note ===&lt;br /&gt;
The reason the MS Access database seems to allow this condition is likely a difference in character case between column names and primary key designations.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#59) Zero BIOGRAPHY.b_animid_num values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
SokweDB requires that the animal ID number be greater than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;, but some rows (which seem to be rows for babies) have a &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;select * from clean.biography where b_animid_num = 0;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Change the &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; values to &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This is a brute-force, but adequate, solution because it does not validate anything concerning the rows affected.&lt;br /&gt;
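&lt;br /&gt;
A minimal sketch of the change, assuming PostgreSQL and an in-place update of the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace 0 animal ID numbers with NULL.&lt;br /&gt;
update clean.biography&lt;br /&gt;
  set b_animid_num = NULL&lt;br /&gt;
  where b_animid_num = 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;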
&lt;br /&gt;
&lt;br /&gt;
== * (#60) Invalid biography_update_log.made_by values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There is a &amp;lt;code&amp;gt;biography_update_log.made_by&amp;lt;/code&amp;gt; value (&amp;lt;code&amp;gt;SF/EVL&amp;lt;/code&amp;gt;) that is not a person.  (It is not in the &amp;lt;code&amp;gt;PEOPLE&amp;lt;/code&amp;gt; table.)&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.biography_update_log&lt;br /&gt;
  WHERE made_by IS NOT NULL&lt;br /&gt;
        AND NOT EXISTS (SELECT 1&lt;br /&gt;
                          FROM clean.people&lt;br /&gt;
                          WHERE people.person = biography_update_log.made_by);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#61) Invalid follow date/focals pairs in follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are &amp;lt;code&amp;gt;follow_arrival.fa_fol_b_focal_animid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;follow_arrival.fa_fol_date&amp;lt;/code&amp;gt; value combinations that do not exist in &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The invalid values can be listed (from the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
          (SELECT 1&lt;br /&gt;
             FROM clean.follow&lt;br /&gt;
             WHERE follow.fol_date = follow_arrival.fa_fol_date&lt;br /&gt;
                   AND follow.fol_b_animid&lt;br /&gt;
                   = follow_arrival.fa_fol_b_focal_animid)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== (#62) There are follow_arrival rows with arriving animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 39 follow arrivals where the arriving animid has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from easy.follow_arrival where fa_b_arr_animid &amp;lt;&amp;gt; rtrim(fa_b_arr_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
== * (#63) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_data_source values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 5,020 follow arrivals where the fa_data_source is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_data_source IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; value in ARRIVAL_SOURCES, and use that value instead of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; in the follow_arrival table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
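&lt;br /&gt;
The second half of this could be sketched as follows, assuming PostgreSQL and that the &amp;lt;code&amp;gt;none&amp;lt;/code&amp;gt; row has already been added to ARRIVAL_SOURCES.  The columns of ARRIVAL_SOURCES are not described here, so that insert is left out:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: assumes a none value already exists in ARRIVAL_SOURCES.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;none&amp;#039;&lt;br /&gt;
  where fa_data_source is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;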
&lt;br /&gt;
&lt;br /&gt;
== * (#64) There are follow_arrival rows with &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; fa_type_of_certainty values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow arrival where the fa_type_of_certainty is &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM easy.follow_arrival&lt;br /&gt;
  WHERE fa_type_of_certainty IS NULL&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=472</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=472"/>
		<updated>2025-10-31T16:45:40Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* Bad solution */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
        SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
        TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
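&lt;br /&gt;
A minimal sketch of one way such values could be normalized, assuming the fix keeps only the time of day and discards the date part. The conversion program&amp;#039;s actual method is not recorded here, and the sample rows are hypothetical. &amp;#039;1899-12-30&amp;#039; is the date MS Access stores alongside time-only values.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample rows standing in for raw.&amp;quot;BRECORD_NOTES&amp;quot;.&lt;br /&gt;
WITH sample (&amp;quot;BREC_time&amp;quot;) AS (&lt;br /&gt;
  VALUES (TIMESTAMP &amp;#039;1899-12-30 07:15:00&amp;#039;)&lt;br /&gt;
       , (TIMESTAMP &amp;#039;1974-03-02 07:15:00&amp;#039;))&lt;br /&gt;
SELECT &amp;quot;BREC_time&amp;quot;&lt;br /&gt;
       -- True when the value carries a date other than the Access zero date.&lt;br /&gt;
     , (&amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; DATE &amp;#039;1899-12-30&amp;#039;) AS has_stray_date&lt;br /&gt;
       -- Assumed fix: keep the time of day, drop the date.&lt;br /&gt;
     , &amp;quot;BREC_time&amp;quot;::TIME AS time_only&lt;br /&gt;
  FROM sample;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;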
&lt;br /&gt;
== * (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
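&lt;br /&gt;
The substitution can be sketched as follows. This is an illustration only, not the code from the commit; the sample rows and the &amp;#039;P1&amp;#039; person code are hypothetical.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample rows standing in for the update log.&lt;br /&gt;
WITH sample_log (chimp_id, made_by) AS (&lt;br /&gt;
  VALUES (&amp;#039;AAA&amp;#039;, &amp;#039;P1&amp;#039;)&lt;br /&gt;
       , (&amp;#039;BBB&amp;#039;, NULL))&lt;br /&gt;
SELECT chimp_id&lt;br /&gt;
       -- A NULL made_by is replaced by the unknown person code.&lt;br /&gt;
     , COALESCE(made_by, &amp;#039;UNK&amp;#039;) AS made_by&lt;br /&gt;
  FROM sample_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;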
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== * (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; column is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is data entry error, ignore the problem and ignore the data in the follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== * (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; column is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is data entry error, ignore the problem and ignore the data in the follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== * (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 1,848 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
If either table says the focal was in a nest, then have the focal be in a nest.&lt;br /&gt;
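&lt;br /&gt;
A minimal sketch of this rule, assuming (as the query above implies) that fa_type_of_nesting codes 1 and 3 mean the individual started in a nest. The sample rows are hypothetical, including the use of 0 as a &amp;quot;no nest&amp;quot; code.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical combinations of the follow flag and the first arrival&amp;#039;s nesting code.&lt;br /&gt;
WITH sample (fol_flag_begin_in_nest, fa_type_of_nesting) AS (&lt;br /&gt;
  VALUES (0, 1), (1, 0), (1, 3), (0, 0))&lt;br /&gt;
SELECT fol_flag_begin_in_nest&lt;br /&gt;
     , fa_type_of_nesting&lt;br /&gt;
       -- In a nest if either source says so.&lt;br /&gt;
     , (fol_flag_begin_in_nest = 1&lt;br /&gt;
        OR fa_type_of_nesting IN (1, 3)) AS begins_in_nest&lt;br /&gt;
  FROM sample;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only the last sample row, where neither source reports a nest, yields a false begins_in_nest.&lt;br /&gt;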
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problem #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== * (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
If either table says the focal was in a nest, then have the focal be in a nest.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of 9 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== * (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that were done before their focal was under study, that is, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 columns go into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 columns go into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the &amp;quot;NONE&amp;quot; (no observer) person for both observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
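&lt;br /&gt;
No query for the affected rows is recorded in this entry.  A possible sketch, assuming that &amp;quot;only one observer&amp;quot; means exactly one of the two observer columns of a time period is NULL or empty after trimming (the same blank test used in problem #37):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Follows with exactly one recorded observer in the AM or the PM period&lt;br /&gt;
-- (sketch only; compares the two boolean &amp;quot;is blank&amp;quot; tests of each period)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;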
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  That is about half that many unique names once case is ignored.&lt;br /&gt;
&lt;br /&gt;
Accounting for this and excluding duplicates in the conversion process is complicated, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 217 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 169 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
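&lt;br /&gt;
A sketch of the second change, assuming it is applied as an update against the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (where and how the conversion actually applies it is not stated here):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: give non-females with a cycle code of 0 the code n/a&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;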
&lt;br /&gt;
&lt;br /&gt;
== * (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 239 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 127 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
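&lt;br /&gt;
A sketch of the change, assuming it is a single update against &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (the actual cleanup code is not shown on this page):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: females with a cycle code of n/a get the code MISS&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;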
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
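&lt;br /&gt;
A sketch of the two case fixes, assuming they are applied directly to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; (how the conversion actually applies them is not recorded here):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the case variants of Brec&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
-- Sketch only: normalize the case variant of Tiki&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;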
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and checking only the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
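&lt;br /&gt;
The start-time-only check mentioned in the problem description has no query recorded.  A sketch, assuming it follows the same approach as the query above with &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; left out of the grouping and the match:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (sketch)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;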
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 688 rows, having only 3 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 26 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN community start date in MS Access, October 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=471</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=471"/>
		<updated>2025-10-31T16:45:19Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#55) There are community_membership rows that place an individual in a community before birth */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted the row.&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted the time stamp.&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted the time stamp.&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
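&lt;br /&gt;
A minimal sketch of one way the stray date portion could be discarded, assuming only the time of day in BREC_time is meaningful. The actual conversion code is not shown on this page and may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: only the time of day in BREC_time matters.&lt;br /&gt;
-- Casting the timestamp to TIME drops the date portion.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot; as original_value&lt;br /&gt;
     , &amp;quot;BREC_time&amp;quot;::TIME as time_only&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;&lt;br /&gt;
  where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;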
&lt;br /&gt;
== * (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 by ICG.&lt;br /&gt;
Changed the end date of the KK_P1 membership to 8/14/2022.&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
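&lt;br /&gt;
A minimal sketch of the substitution described above, assuming a PEOPLE row with the person code UNK already exists. The code in the cited commit is not shown on this page and may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: clean.people already has a row whose person is &amp;#039;UNK&amp;#039;.&lt;br /&gt;
-- coalesce() returns made_by when it is not NULL, otherwise &amp;#039;UNK&amp;#039;.&lt;br /&gt;
select log.date_of_update&lt;br /&gt;
     , log.chimp_id&lt;br /&gt;
     , coalesce(log.made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log as log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;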
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== * (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is a data entry error; ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== * (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is a data entry error; ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== * (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 1,848 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
If either table says the focal was in a nest, then have the focal be in a nest.&lt;br /&gt;
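&lt;br /&gt;
A minimal sketch of this rule, not the conversion process&amp;#039;s actual code.  It assumes, as the Bad Data query does, that &amp;lt;code&amp;gt;fa_type_of_nesting&amp;lt;/code&amp;gt; values 1 and 3 mean the focal started in a nest.  The &amp;lt;code&amp;gt;begin_in_nest&amp;lt;/code&amp;gt; output column is a hypothetical name, and restricting the join to the focal&amp;#039;s own arrival row is an added assumption.  A NULL &amp;lt;code&amp;gt;fa_type_of_nesting&amp;lt;/code&amp;gt; produces NULL, not false, unless the follow flag is 1.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: true when either table says the focal started in a nest.&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT follow.fol_date, follow.fol_b_animid&lt;br /&gt;
     , (first_arrivals.fa_type_of_nesting IN (1, 3)&lt;br /&gt;
        OR follow.fol_flag_begin_in_nest = 1) AS begin_in_nest&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          -- Assumption: only the focal&amp;#039;s own arrival row matters here.&lt;br /&gt;
          AND first_arrivals.fa_b_arr_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  ORDER BY follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;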
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
When applying the solution, solving both problems #26 and #27,&lt;br /&gt;
and updating follow_arrival so that it is the one source of truth&lt;br /&gt;
used by the conversion,&lt;br /&gt;
there are 297 follow_arrival rows updated.&lt;br /&gt;
&lt;br /&gt;
This implies that it is primarily the follow table&lt;br /&gt;
that does not mark the individual as being in a&lt;br /&gt;
nest when they should be in a nest.  (&amp;quot;Should be&amp;quot;,&lt;br /&gt;
according to the accepted solution.)&lt;br /&gt;
&lt;br /&gt;
== * (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
If either table says the focal was in a nest, then have the focal be in a nest.&lt;br /&gt;
&lt;br /&gt;
==== Remarks ====&lt;br /&gt;
&lt;br /&gt;
See the remarks for [[Conversion Data Issues#(#26) Mismatch of start-in-nest on follow and follow_arrival|problem #26]].&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of 9 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
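&lt;br /&gt;
As an illustration only, the cleanup for this problem and for problem #31 can be expressed with the same &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; expression that the problem #33 query uses.  The column aliases are hypothetical, and the conversion process may implement the cleanup differently.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: show each animid that the cleanup would change.&lt;br /&gt;
select fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;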
&lt;br /&gt;
== * (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
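&lt;br /&gt;
The rule described above can be sketched as a query.  This is an illustration under assumptions, not the conversion code: the period labels &amp;lt;code&amp;gt;AM&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;PM&amp;lt;/code&amp;gt; and the output column names are hypothetical.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: one output row per follow and time period, but only when&lt;br /&gt;
-- at least one of that period&amp;#039;s observers is non-blank.&lt;br /&gt;
SELECT fol_date, fol_b_animid&lt;br /&gt;
     , &amp;#039;AM&amp;#039; AS period   -- assumed label&lt;br /&gt;
     , NULLIF(BTRIM(fol_am_observer_1), &amp;#039;&amp;#039;) AS obs_brec&lt;br /&gt;
     , NULLIF(BTRIM(fol_am_observer_2), &amp;#039;&amp;#039;) AS obs_tiki&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
  WHERE COALESCE(BTRIM(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        OR COALESCE(BTRIM(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
UNION ALL&lt;br /&gt;
SELECT fol_date, fol_b_animid&lt;br /&gt;
     , &amp;#039;PM&amp;#039; AS period   -- assumed label&lt;br /&gt;
     , NULLIF(BTRIM(fol_pm_observer_1), &amp;#039;&amp;#039;) AS obs_brec&lt;br /&gt;
     , NULLIF(BTRIM(fol_pm_observer_2), &amp;#039;&amp;#039;) AS obs_tiki&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
  WHERE COALESCE(BTRIM(fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        OR COALESCE(BTRIM(fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
ORDER BY fol_date, fol_b_animid, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;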
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
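&lt;br /&gt;
A minimal sketch of the substitution, shown for the AM period only; the PM columns would be handled the same way.  The output column names are hypothetical, and the literal &amp;lt;code&amp;gt;NONE&amp;lt;/code&amp;gt; stands in for however the conversion identifies that person.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: use the NONE person wherever an observer is NULL or blank.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as am_obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as am_obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;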
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
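&lt;br /&gt;
To find the values that would be marked inactive, a query along these lines may help.  It is a sketch that makes no case comparison and simply lists every distinct observer value containing a &amp;quot;/&amp;quot;, so it also returns values such as &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: distinct observer values that contain a &amp;quot;/&amp;quot;.&lt;br /&gt;
WITH people AS (&lt;br /&gt;
  SELECT BTRIM(fol_am_observer_1) AS person FROM clean.follow&lt;br /&gt;
  UNION&lt;br /&gt;
  SELECT BTRIM(fol_am_observer_2) FROM clean.follow&lt;br /&gt;
  UNION&lt;br /&gt;
  SELECT BTRIM(fol_pm_observer_1) FROM clean.follow&lt;br /&gt;
  UNION&lt;br /&gt;
  SELECT BTRIM(fol_pm_observer_2) FROM clean.follow&lt;br /&gt;
)&lt;br /&gt;
SELECT person&lt;br /&gt;
  FROM people&lt;br /&gt;
  WHERE STRPOS(person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
  ORDER BY LOWER(person), person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;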
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  That is about half as many distinct names once case is ignored.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates complicates the conversion process, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
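&lt;br /&gt;
One possible way to choose a preferred spelling per name is sketched below.  It rests on assumptions: &amp;quot;mixed case&amp;quot; is taken to mean a value that is neither all upper-case nor all lower-case, and when no such variant exists the first variant in sort order is picked arbitrarily.  The output column names are hypothetical.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: one preferred spelling for each case-insensitive name.&lt;br /&gt;
WITH people AS (&lt;br /&gt;
  SELECT BTRIM(fol_am_observer_1) AS person FROM easy.follow&lt;br /&gt;
  UNION&lt;br /&gt;
  SELECT BTRIM(fol_am_observer_2) FROM easy.follow&lt;br /&gt;
  UNION&lt;br /&gt;
  SELECT BTRIM(fol_pm_observer_1) FROM easy.follow&lt;br /&gt;
  UNION&lt;br /&gt;
  SELECT BTRIM(fol_pm_observer_2) FROM easy.follow&lt;br /&gt;
)&lt;br /&gt;
SELECT DISTINCT ON (LOWER(person))&lt;br /&gt;
       LOWER(person) AS folded_name&lt;br /&gt;
     , person AS preferred_name&lt;br /&gt;
  FROM people&lt;br /&gt;
  WHERE person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  ORDER BY LOWER(person)&lt;br /&gt;
           -- Mixed-case variants sort first (true before false).&lt;br /&gt;
         , (person &amp;lt;&amp;gt; UPPER(person) AND person &amp;lt;&amp;gt; LOWER(person)) DESC&lt;br /&gt;
         , person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;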
&lt;br /&gt;
&lt;br /&gt;
== * (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
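&lt;br /&gt;
As a sketch, the substitution amounts to a &amp;lt;code&amp;gt;coalesce&amp;lt;/code&amp;gt; on the cycle column.  The &amp;lt;code&amp;gt;cycle_state&amp;lt;/code&amp;gt; alias is a hypothetical name.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: show the MISS value for arrivals with no cycle information.&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;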
&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 217 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 169 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
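&lt;br /&gt;
The recoding could be previewed with a query like the following.  It is a sketch only: the &amp;lt;code&amp;gt;new_cycle&amp;lt;/code&amp;gt; alias is hypothetical, and every code other than &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; is left unchanged.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: non-female arrivals, with a cycle code of 0 recoded to n/a.&lt;br /&gt;
select follow_arrival.fa_fol_date&lt;br /&gt;
     , follow_arrival.fa_b_arr_animid&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , follow_arrival.fa_type_of_cycle&lt;br /&gt;
     , case when follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039; then &amp;#039;n/a&amp;#039;&lt;br /&gt;
            else follow_arrival.fa_type_of_cycle&lt;br /&gt;
       end as new_cycle&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;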
&lt;br /&gt;
&lt;br /&gt;
== * (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 239 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 127 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
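&lt;br /&gt;
The change might look roughly like the statement below.  This is a sketch, not the actual cleanup code, and it assumes &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; is an updatable table, not a view.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: recode n/a to MISS for female arriving individuals.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;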
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
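&lt;br /&gt;
A minimal sketch of the two renames above, assuming they are applied as plain updates to the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema. The conversion itself may normalize the values differently.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only; assumes the fix is made directly in clean.follow_arrival.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;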
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for a row to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
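&lt;br /&gt;
The start-time-only count from the Problem section could be checked with a query of the same shape, leaving out &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;. This is a sketch, assuming the check was made the same way as the start and end time check.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (sketch; assumed form of the check)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;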
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 688 rows, having only 3 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 26 follow_arrival rows in which the arriving individual is a female less than 8 years of age but has a sexual cycle code indicating sexual swelling.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed the FN birthdate in MS Access, Oct 2025.&lt;br /&gt;
&lt;br /&gt;
== (#56) There are follow_arrival focal animids with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 144 follow arrivals where the focal id has trailing spaces.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from follow_arrival where fa_fol_b_focal_animid &amp;lt;&amp;gt; rtrim(fa_fol_b_focal_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces from the table in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=465</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=465"/>
		<updated>2025-10-23T20:27:11Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* Solution */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
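&lt;br /&gt;
The page does not say how the conversion fixes these values. One possible approach, shown only as a sketch, assumes the fix is to discard the date portion and keep the time of day:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only; assumes the date portion of BREC_time can be dropped.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;::TIME as brec_time&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;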
&lt;br /&gt;
== * (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
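&lt;br /&gt;
The substitution can be sketched with &amp;lt;code&amp;gt;coalesce&amp;lt;/code&amp;gt;. This assumes the unknown person&amp;#039;s code is &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; and is not taken from the commit itself; the output column name is hypothetical.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show each log row with a NULL made_by replaced by &amp;#039;UNK&amp;#039;.&lt;br /&gt;
-- made_by_or_unk is a hypothetical column name used for illustration.&lt;br /&gt;
select log.*&lt;br /&gt;
     , coalesce(log.made_by, &amp;#039;UNK&amp;#039;) as made_by_or_unk&lt;br /&gt;
  from clean.community_membership_update_log as log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;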
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale.&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single person code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== * (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is a data entry error; ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== * (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is a data entry error; ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== * (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 1,848 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
If either table says the focal was in a nest, then have the focal be in a nest.&lt;br /&gt;
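&lt;br /&gt;
The rule can be sketched as a single OR over the two sources.  This is an illustration only, not the conversion program&amp;#039;s code: it reuses the joins of the query above, assumes (as that query does) that nesting codes 1 and 3 mean the focal started in a nest, and uses a hypothetical output column named &amp;lt;code&amp;gt;begin_in_nest&amp;lt;/code&amp;gt;.  A NULL fa_type_of_nesting (see problem #41) can make the result NULL rather than false.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: the focal starts in a nest if either table says so.&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT follow.fol_date, follow.fol_b_animid&lt;br /&gt;
     , (follow.fol_flag_begin_in_nest = 1&lt;br /&gt;
        OR first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
        OR first_arrivals.fa_type_of_nesting = 3) AS begin_in_nest&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  ORDER BY follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;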
&lt;br /&gt;
== * (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM easy.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM easy.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN easy.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
If either table says the focal was in a nest, then have the focal be in a nest.&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of 9 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
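&lt;br /&gt;
A minimal sketch of the combined cleanup for problems #31 and #32, assuming it amounts to trimming trailing spaces and upper-casing (the same expression used in the query for problem #33).  The output column names are hypothetical.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: show each animid that the cleanup would change.&lt;br /&gt;
select fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;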
&lt;br /&gt;
== * (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
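&lt;br /&gt;
As a simpler sketch, the following lists every distinct observer value containing a &amp;quot;/&amp;quot; character, whether or not a case variant of it exists.  It is an illustration only; the &amp;lt;code&amp;gt;observers&amp;lt;/code&amp;gt; name is hypothetical, and the rows returned (which include &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;, if present) may differ from those of the query above.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: distinct observer values that contain a slash.&lt;br /&gt;
WITH observers AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
)&lt;br /&gt;
SELECT observers.person&lt;br /&gt;
  FROM observers&lt;br /&gt;
  WHERE STRPOS(observers.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
  ORDER BY observers.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;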
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 217 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 169 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
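&lt;br /&gt;
The second rule can be sketched with a CASE expression.  This is an illustration only, not the conversion program&amp;#039;s code; it assumes the cycle code is stored as text, so that the code &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; is the string &amp;#039;0&amp;#039;, and the &amp;lt;code&amp;gt;converted_cycle&amp;lt;/code&amp;gt; output column name is hypothetical.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: preview the 0 to n/a change for non-females.&lt;br /&gt;
select follow_arrival.fa_fol_date&lt;br /&gt;
     , follow_arrival.fa_b_arr_animid&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , follow_arrival.fa_type_of_cycle&lt;br /&gt;
     , case&lt;br /&gt;
         when follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;&lt;br /&gt;
           then &amp;#039;n/a&amp;#039;&lt;br /&gt;
         else follow_arrival.fa_type_of_cycle&lt;br /&gt;
       end as converted_cycle&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;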
&lt;br /&gt;
&lt;br /&gt;
== * (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit and set a soft limit of 5 years in the warning system, with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
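&lt;br /&gt;
To see which individuals account for the remaining rows, the bad data query can be grouped by arriving individual.  This is a sketch only; the output column names &amp;lt;code&amp;gt;problem_rows&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;first_follow&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;last_follow&amp;lt;/code&amp;gt; are made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Same conditions as the bad data query, one output row per individual.&lt;br /&gt;
select follow_arrival.fa_b_arr_animid&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
     , count(*) as problem_rows&lt;br /&gt;
     , min(follow_arrival.fa_fol_date) as first_follow&lt;br /&gt;
     , max(follow_arrival.fa_fol_date) as last_follow&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid, biography.b_birthdate&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;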
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 239 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 127 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Because an individual remains 14 years old until her 15th birthday, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                  - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                  - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be fixed in the data.  Alternatively, we can adjust the hard limit and set a soft limit of 14 years in the warning system, with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
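&lt;br /&gt;
The case corrections above could be sketched as a single statement.  This assumes the fix is applied in place to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;; the actual conversion may instead apply it while copying from the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Assumption: only these three misspellings need correcting.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source&lt;br /&gt;
        = case fa_data_source&lt;br /&gt;
            when &amp;#039;BREC&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
            when &amp;#039;brec&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
            when &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
          end&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;, &amp;#039;TikI&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;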
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source  count&lt;br /&gt;
  Tiki_GM           202&lt;br /&gt;
  Tiki_PM           165&lt;br /&gt;
  Tiki_SS            22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
     , fa_fol_b_focal_animid&lt;br /&gt;
     , fa_b_arr_animid&lt;br /&gt;
     , fa_seq_num&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  group by fa_fol_date&lt;br /&gt;
         , fa_fol_b_focal_animid&lt;br /&gt;
         , fa_b_arr_animid&lt;br /&gt;
         , fa_seq_num&lt;br /&gt;
  having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
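&lt;br /&gt;
The 430-row figure in the problem description comes from leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;.  The query used for that check is not recorded here; the following is a sketch, assuming it mirrors the start and end time query.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (sketch)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;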
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 688 rows, having only 3 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in MS Access 10/2025. AMA--&amp;gt;AME, OBE--&amp;gt;POR&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 26 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
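Note: The test is against the birthdate, not the minimum possible birthdate.  The query also excludes the arriving individuals &amp;lt;code&amp;gt;MGF&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;MGF2&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;MGF3&amp;lt;/code&amp;gt;.&lt;br /&gt;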
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=457</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=457"/>
		<updated>2025-10-22T22:37:56Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#35) There are follows done before a focal was under study */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted the row.&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted the time stamp.&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted the time stamp.&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== * (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).  The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 by ICG.  Changed the end date of the KK_P1 membership to 8/14/2022.&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
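&lt;br /&gt;
A minimal sketch of the substitution, assuming it can be expressed with &amp;lt;code&amp;gt;COALESCE&amp;lt;/code&amp;gt; over the columns used in the queries for problems #17 and #18.  The conversion program itself may do this differently:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: replace a NULL made_by with the UNK person.&lt;br /&gt;
select log.date_of_update&lt;br /&gt;
     , log.chimp_id&lt;br /&gt;
     , coalesce(log.made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log as log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;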
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== * (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is a data entry error; ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== * (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last departure&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is a data entry error; ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== * (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 1,848 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN clean.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN clean.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These involve 9 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
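&lt;br /&gt;
A sketch of the cleanup, assuming the conversion normalizes animids with the same &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; expression that the queries in problem #33 use.  It lists each raw animid in the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema that the cleanup would change, next to its cleaned form (the output column names are made up for illustration):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: raw animids that differ from their normalized form.&lt;br /&gt;
select distinct fol_b_animid as raw_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by raw_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;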
&lt;br /&gt;
== * (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These involve 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
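&lt;br /&gt;
The row-creation rule described in this section can be sketched as a query.  This is an illustration, not the conversion program&amp;#039;s code: the period labels &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; and the output column names are assumptions.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: one row per follow and time period,&lt;br /&gt;
-- skipping periods where both observers are NULL or blank.&lt;br /&gt;
-- The &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; labels are assumed, not taken from PERIODS.&lt;br /&gt;
select fol_date, fol_b_animid, &amp;#039;AM&amp;#039; as period&lt;br /&gt;
     , btrim(fol_am_observer_1) as obs_brec&lt;br /&gt;
     , btrim(fol_am_observer_2) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
union all&lt;br /&gt;
select fol_date, fol_b_animid, &amp;#039;PM&amp;#039; as period&lt;br /&gt;
     , btrim(fol_pm_observer_1)&lt;br /&gt;
     , btrim(fol_pm_observer_2)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;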
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
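&lt;br /&gt;
A minimal sketch of the substitution for the morning observers, assuming the person code is the literal &amp;#039;NONE&amp;#039; and that blank values are treated like NULL.  The output column names are made up for illustration:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: fill a missing AM observer with the NONE person,&lt;br /&gt;
-- for follows that have at least one AM observer.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as am_obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as am_obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;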
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed-case code when one exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 10/2025, ICG.&lt;br /&gt;
&lt;br /&gt;
== * (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
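&lt;br /&gt;
A minimal sketch of that change, assuming &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; has already been added as an accepted cycle code and that the fix is applied in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema (both are assumptions, not something this page confirms):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: assumes MISS is an accepted fa_type_of_cycle value.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;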
&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching row in the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
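&lt;br /&gt;
To see which focals account for the unmatched rows, a summary along the following lines may help.  It is only a sketch that reuses the tables and columns from the query above; grouping by focal is an illustrative choice.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Count the unmatched follow_arrival rows per focal individual.&lt;br /&gt;
select follow_arrival.fa_fol_b_focal_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.follow&lt;br /&gt;
       where follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             and follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  group by follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
  order by follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;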
&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 217 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 169 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
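&lt;br /&gt;
The recoding of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt; could be sketched as follows.  This assumes the change is applied in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema; the table and column names are those used in the queries above.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: recode cycle code 0 to n/a for non-female arrivals.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;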
&lt;br /&gt;
&lt;br /&gt;
== * (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit and set a soft limit of 5 years in the warning system, with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 239 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 127 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be fixed in the data.  Alternatively, we can adjust the hard limit and set a soft limit of 14 years in the warning system, with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
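&lt;br /&gt;
A sketch of that recoding, assuming &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; is an accepted cycle code; the statement is illustrative and may not match what the conversion scripts actually run:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: recode n/a to MISS for female arrivals.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;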
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
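&lt;br /&gt;
The two case corrections could be sketched as follows.  The target table is an assumption: the statements are written against &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;, on the understanding that the cleanup happens when the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema is built from &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the case variants of the data source codes.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;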
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and checking only the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
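&lt;br /&gt;
The start-time-only check described in the Problem section has no query listed.  A sketch, mirroring the query above but leaving &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; out of the grouping, might look like the following.  Note that rows with a NULL &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; never satisfy the equality test in the subquery, so they are not listed.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (sketch)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;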
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 688 rows, having only 3 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fix in the MS Access data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 26 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=456</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=456"/>
		<updated>2025-10-22T22:36:16Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#41)There are follow arrivals with NULL nesting information */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
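&lt;br /&gt;
A minimal sketch of how the date portion can be discarded, assuming the fix amounts to a cast to a time (the actual code of the conversion program may differ).  The sample values are made up, so the query runs without the database:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample values, not rows from BRECORD_NOTES.&lt;br /&gt;
-- Casting a timestamp to a time keeps only the time of day.&lt;br /&gt;
select demo.brec_time&lt;br /&gt;
     , demo.brec_time::TIME WITHOUT TIME ZONE as time_only&lt;br /&gt;
  from (values (&amp;#039;1899-12-30 07:15:00&amp;#039;::TIMESTAMP)&lt;br /&gt;
             , (&amp;#039;2004-03-02 07:15:00&amp;#039;::TIMESTAMP)) as demo (brec_time);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;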
&lt;br /&gt;
== * (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
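&lt;br /&gt;
The substitution can be sketched with &amp;lt;code&amp;gt;coalesce()&amp;lt;/code&amp;gt;.  This is an illustration only, not the code from the commit; the &amp;lt;code&amp;gt;ABC&amp;lt;/code&amp;gt; person code is made up:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample values; a NULL made_by becomes the UNK person.&lt;br /&gt;
select coalesce(demo.made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from (values (&amp;#039;ABC&amp;#039;), (NULL)) as demo (made_by);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;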
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== * (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== * (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is a data entry error.  Ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== * (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 1,848 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN clean.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN clean.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These are comprised of 9 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
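&lt;br /&gt;
Together with the fix for problem #31, the cleanup can be sketched as the &amp;lt;code&amp;gt;upper(rtrim())&amp;lt;/code&amp;gt; expression used in the query for problem #33.  The sample values below are made up, and the conversion program may apply the cleanup differently:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical sample values: trailing spaces, lower-case, and already clean.&lt;br /&gt;
select demo.fol_b_animid as original&lt;br /&gt;
     , upper(rtrim(demo.fol_b_animid)) as cleaned&lt;br /&gt;
  from (values (&amp;#039;TT  &amp;#039;), (&amp;#039;tt&amp;#039;), (&amp;#039;TT&amp;#039;)) as demo (fol_b_animid);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;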
&lt;br /&gt;
== * (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These are comprised of 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
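&lt;br /&gt;
As a rough sketch of the rule above (not the actual conversion code), the following query lists, for each follow, the rows that would be created for the morning and afternoon periods.  The period labels &amp;lt;code&amp;gt;AM&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;PM&amp;lt;/code&amp;gt; and the output column names are assumptions made for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: one candidate row per follow per period.&lt;br /&gt;
-- The &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; labels are hypothetical period codes.&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, periods.period&lt;br /&gt;
     , periods.obs_brec, periods.obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    cross join lateral&lt;br /&gt;
      (values (&amp;#039;AM&amp;#039;, btrim(follow.fol_am_observer_1)&lt;br /&gt;
                   , btrim(follow.fol_am_observer_2))&lt;br /&gt;
            , (&amp;#039;PM&amp;#039;, btrim(follow.fol_pm_observer_1)&lt;br /&gt;
                   , btrim(follow.fol_pm_observer_2))&lt;br /&gt;
      ) as periods (period, obs_brec, obs_tiki)&lt;br /&gt;
  -- Skip a period when both observers are NULL or empty&lt;br /&gt;
  where coalesce(periods.obs_brec, &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(periods.obs_tiki, &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid, periods.period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;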
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the &amp;quot;NONE&amp;quot; (no observer) person for both observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
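&lt;br /&gt;
No query is recorded for this problem.  A sketch of one that lists the affected follows, assuming the same &amp;quot;NULL or empty after trimming&amp;quot; test that problem #37 uses, might be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: follows where a period has exactly one of its two observers&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where (coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
        or (coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
          &amp;lt;&amp;gt; (coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;)&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;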
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  -- Names containing a &amp;quot;/&amp;quot; look like two people.&lt;br /&gt;
  -- No case-insensitive self-join is needed to find them.&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
    WHERE STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ from another name only by character case.  That is about half as many unique names when case is ignored.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates in the conversion process is complicated, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 217 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 169 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
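&lt;br /&gt;
The change from &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt; could be sketched as the update below.  This is an illustration only; it assumes the change is made in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, which this page does not state.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: assumes the fix is applied to clean.follow_arrival&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;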
&lt;br /&gt;
&lt;br /&gt;
== * (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 239 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 127 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
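&lt;br /&gt;
The case corrections above could be applied with updates along these lines.  This is a sketch only; it assumes the corrections are made in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, and it changes only the spellings named above.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the misspelled data source codes&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;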
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
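&lt;br /&gt;
The query for the check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; is not recorded here.  A sketch, adapted from the query above by dropping that column, would presumably be:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: checking against start time alone&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join dups&lt;br /&gt;
         on (follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;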
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end-times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 688 rows, having only 3 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fix in the MS Access data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 26 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=455</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=455"/>
		<updated>2025-10-22T22:35:14Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#34) There are duplicate animid, date combinations on FOLLOW */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted the row.&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
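&lt;br /&gt;
A minimal sketch of one possible fix, assuming the date portion carries no information and can simply be dropped (the actual conversion code is not shown on this page):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustrative only: keep the time of day, discard the date.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;::TIME as brec_time&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;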
&lt;br /&gt;
== * (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
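&lt;br /&gt;
As a rough sketch only (the conversion program&amp;#039;s actual code is not shown on this page), the substitution could be written with &amp;lt;code&amp;gt;coalesce&amp;lt;/code&amp;gt;, assuming &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; is the code given to the unknown person:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustrative only: report UNK wherever made_by is NULL.&lt;br /&gt;
select date_of_update, chimp_id, coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;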
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single person code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== * (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is a data entry error; ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== * (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is a data entry error; ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== * (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 1,848 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN clean.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN clean.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These comprise 9 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
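&lt;br /&gt;
As a sketch only (the conversion code itself is not shown on this page), the trailing-space cleanup of problem #31 and the upper-casing described here can be previewed together against the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.  The column aliases are made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: show each animid that the cleanup would change&lt;br /&gt;
select fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;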
&lt;br /&gt;
== * (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These comprise 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that were done before their focal was under study, that is, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 columns go into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 columns go into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person both observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So there are about half that many unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for this and excluding the duplicates in the conversion process is complicated, so resolving it is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
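&lt;br /&gt;
The 732 rows may belong to far fewer missing follows.  As a sketch (not part of the original analysis, and assuming the same &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema columns used above), the following counts the orphaned arrival rows per date and focal.  The &amp;lt;code&amp;gt;orphaned_arrivals&amp;lt;/code&amp;gt; alias is made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: one row per (date, focal) that has arrivals but no follow&lt;br /&gt;
SELECT follow_arrival.fa_fol_date&lt;br /&gt;
     , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
     , COUNT(*) AS orphaned_arrivals&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  GROUP BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;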
&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 217 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 169 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 239 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 127 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
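&lt;br /&gt;
The two intervals in the query add up to 15 years, which is why a female must be at least 15 years old before a row is flagged.  A small illustration, using a made-up follow date and a made-up column alias:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only; the date is hypothetical.&lt;br /&gt;
-- Subtracting 1 year and then 14 years moves the cutoff back 15 years.&lt;br /&gt;
select &amp;#039;2000-06-15&amp;#039;::DATE&lt;br /&gt;
         - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
         - &amp;#039;14 years&amp;#039;::INTERVAL as latest_flagged_birthdate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For this date the cutoff is 15 June 1985, so a &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; row is flagged only when the female was born on or before that day.&lt;br /&gt;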
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
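&lt;br /&gt;
A sketch of the case fixes as a query, assuming the only case variants are the three named above.  How the conversion actually applies them is not shown on this page, and the &amp;lt;code&amp;gt;original&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;normalized&amp;lt;/code&amp;gt; aliases are made up for illustration.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: preview the normalized data source codes, with counts&lt;br /&gt;
select follow_arrival.fa_data_source as original&lt;br /&gt;
     , case&lt;br /&gt;
         when follow_arrival.fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;)&lt;br /&gt;
           then &amp;#039;Brec&amp;#039;&lt;br /&gt;
         when follow_arrival.fa_data_source = &amp;#039;TikI&amp;#039;&lt;br /&gt;
           then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
         else follow_arrival.fa_data_source&lt;br /&gt;
       end as normalized&lt;br /&gt;
     , count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by normalized, original;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;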
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for a row to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
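&lt;br /&gt;
The start-time-only check described at the top of this entry (leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;) has no query listed. A minimal sketch, assuming the same &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; table and columns used in the queries above, might look like the following. The &amp;lt;code&amp;gt;n_rows&amp;lt;/code&amp;gt; alias is introduced here for illustration only.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: checking against start time only (fa_time_end left off)&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
     , fa_fol_b_focal_animid&lt;br /&gt;
     , fa_b_arr_animid&lt;br /&gt;
     , fa_time_start&lt;br /&gt;
     , count(*) as n_rows  -- illustrative alias: rows sharing this combination&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  group by fa_fol_date&lt;br /&gt;
         , fa_fol_b_focal_animid&lt;br /&gt;
         , fa_b_arr_animid&lt;br /&gt;
         , fa_time_start&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fa_fol_date&lt;br /&gt;
         , fa_fol_b_focal_animid&lt;br /&gt;
         , fa_b_arr_animid&lt;br /&gt;
         , fa_time_start;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;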
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 688 rows, having only 3 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fix in the MS Access data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 26 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=454</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=454"/>
		<updated>2025-10-22T22:34:01Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#33) Some follows have an animid that does not exist, even after animid cleanup */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
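&lt;br /&gt;
How the conversion process fixes the data is not recorded here. One possible approach, shown only as a sketch and assuming the column is a timestamp (as the &amp;lt;code&amp;gt;::DATE&amp;lt;/code&amp;gt; cast in the query above suggests), is to keep just the time of day. The &amp;lt;code&amp;gt;brec_time&amp;lt;/code&amp;gt; alias is illustrative.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: keep the time of day and discard the date part.&lt;br /&gt;
-- The actual conversion code may handle this differently.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;::TIME as brec_time&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;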
&lt;br /&gt;
== * (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
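&lt;br /&gt;
The substitution described above could be sketched as follows. This is an illustration only, assuming that a &amp;lt;code&amp;gt;people&amp;lt;/code&amp;gt; row with the code &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; exists; the actual conversion code is in the commit cited below and may differ.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: use the unknown person wherever made_by is NULL.&lt;br /&gt;
-- Assumes &amp;#039;UNK&amp;#039; is the code given to the unknown person.&lt;br /&gt;
select coalesce(log.made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log as log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;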
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== * (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is data entry error, ignore the problem and ignore the data in the follow.fol_time_begin.&lt;br /&gt;
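&lt;br /&gt;
One way to sanity-check this tentative conclusion is to tally how far fol_time_begin is from the focal&amp;#039;s first arrival.  The query below is only a sketch: it assumes that fol_time_begin and fa_time_start have a type that supports subtraction, which is not confirmed here.  A few large clusters of identical differences might point to a systematic cause rather than random data entry error.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: count follows by the difference between the follow start&lt;br /&gt;
-- and the focal&amp;#039;s first arrival.&lt;br /&gt;
-- ASSUMPTION: both time columns support subtraction.&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT follow.fol_time_begin - spans.min_start AS difference&lt;br /&gt;
     , COUNT(*) AS follows&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE follow.fol_time_begin IS DISTINCT FROM spans.min_start&lt;br /&gt;
  GROUP BY follow.fol_time_begin - spans.min_start&lt;br /&gt;
  ORDER BY COUNT(*) DESC, difference;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;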
&lt;br /&gt;
== * (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is data entry error, ignore the problem and ignore the data in the follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== * (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 1,848 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN clean.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN clean.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  These involve 9 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
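&lt;br /&gt;
A minimal sketch of the cleanup described in #31 and #32, assuming it amounts to applying &amp;lt;code&amp;gt;rtrim()&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;upper()&amp;lt;/code&amp;gt; to fol_b_animid (the same expression the #33 query uses).  The column aliases are illustrative only, and the presence of fol_date in the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema is an assumption; this is not the conversion code itself.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: show each animid that the cleanup would change,&lt;br /&gt;
-- next to its cleaned-up value.&lt;br /&gt;
select fol_date&lt;br /&gt;
     , fol_b_animid as original_animid&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) as cleaned_animid&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(rtrim(fol_b_animid)) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;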
&lt;br /&gt;
== * (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  These involve 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== * (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
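&lt;br /&gt;
The row-creation rule described above can be sketched as a query.  This is an illustration, not the actual conversion code: the period codes &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; and the output column names are assumptions.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: one row per follow and time period, produced only when at&lt;br /&gt;
-- least one of the period&amp;#039;s two observer columns is non-empty.&lt;br /&gt;
-- ASSUMPTION: the period codes &amp;#039;AM&amp;#039; and &amp;#039;PM&amp;#039; are hypothetical.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , &amp;#039;AM&amp;#039; as period&lt;br /&gt;
     , btrim(fol_am_observer_1) as obs_brec&lt;br /&gt;
     , btrim(fol_am_observer_2) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
union all&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , &amp;#039;PM&amp;#039; as period&lt;br /&gt;
     , btrim(fol_pm_observer_1) as obs_brec&lt;br /&gt;
     , btrim(fol_pm_observer_2) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;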
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
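&lt;br /&gt;
A sketch of the substitution, assuming the person code is the literal string &amp;#039;NONE&amp;#039; and that an observer value counts as missing when it is NULL or empty after trimming (as in #36).  Only the AM columns are shown; the PM columns would presumably be handled the same way.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: fill in &amp;#039;NONE&amp;#039; for whichever AM observer is missing,&lt;br /&gt;
-- for follows that have at least one AM observer.&lt;br /&gt;
select fol_date, fol_b_animid&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_1), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(fol_am_observer_2), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) as obs_tiki&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        or coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;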
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these variants and excluding the duplicates is complicated to do in the conversion process, so resolving this is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
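&lt;br /&gt;
One possible way to pick a single spelling per name, shown only as a sketch.  It assumes that &amp;quot;mixed case&amp;quot; means the value is neither all upper-case nor all lower-case, and it breaks ties with &amp;lt;code&amp;gt;MIN()&amp;lt;/code&amp;gt;, which is an arbitrary choice made for illustration.  The notes linked above describe what was actually done.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: for each name that appears in more than one spelling,&lt;br /&gt;
-- prefer a mixed-case spelling, else fall back to any spelling.&lt;br /&gt;
-- ASSUMPTION: the tie-break with MIN() is hypothetical.&lt;br /&gt;
WITH people AS (&lt;br /&gt;
  SELECT DISTINCT BTRIM(v.person) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
         CROSS JOIN LATERAL&lt;br /&gt;
           (VALUES (follow.fol_am_observer_1)&lt;br /&gt;
                 , (follow.fol_am_observer_2)&lt;br /&gt;
                 , (follow.fol_pm_observer_1)&lt;br /&gt;
                 , (follow.fol_pm_observer_2)) AS v (person)&lt;br /&gt;
    WHERE BTRIM(v.person) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
)&lt;br /&gt;
SELECT LOWER(person) AS name_key&lt;br /&gt;
     , COALESCE(MIN(person) FILTER (WHERE person &amp;lt;&amp;gt; UPPER(person)&lt;br /&gt;
                                          AND person &amp;lt;&amp;gt; LOWER(person))&lt;br /&gt;
              , MIN(person)) AS chosen_spelling&lt;br /&gt;
     , COUNT(*) AS spellings&lt;br /&gt;
  FROM people&lt;br /&gt;
  GROUP BY LOWER(person)&lt;br /&gt;
  HAVING COUNT(*) &amp;gt; 1&lt;br /&gt;
  ORDER BY LOWER(person);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;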
&lt;br /&gt;
&lt;br /&gt;
== * (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
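&lt;br /&gt;
A sketch of the substitution, assuming the new code is the literal string &amp;#039;MISS&amp;#039; and that fa_type_of_cycle is a text column.  The alias cycle_state is illustrative only.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: the arrivals with no cycle information, shown with the&lt;br /&gt;
-- value that would replace the NULL.&lt;br /&gt;
select fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid&lt;br /&gt;
     , coalesce(fa_type_of_cycle, &amp;#039;MISS&amp;#039;) as cycle_state&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where fa_type_of_cycle is null&lt;br /&gt;
  order by fa_fol_date, fa_fol_b_focal_animid, fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;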
&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 217 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 169 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
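&lt;br /&gt;
To preview the effect of the second rule, the summary query can be extended with the converted code.  This is a sketch which assumes the code &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; is stored as the string &amp;#039;0&amp;#039;; the aliases are illustrative.  Rows whose converted code is still not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt; would remain to be dealt with.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: per sex and cycle code, what the code would become.&lt;br /&gt;
-- ASSUMPTION: the cycle code 0 is stored as the string &amp;#039;0&amp;#039;.&lt;br /&gt;
select biography.b_sex&lt;br /&gt;
     , follow_arrival.fa_type_of_cycle as original_code&lt;br /&gt;
     , case when follow_arrival.fa_type_of_cycle = &amp;#039;0&amp;#039; then &amp;#039;n/a&amp;#039;&lt;br /&gt;
            else follow_arrival.fa_type_of_cycle&lt;br /&gt;
       end as converted_code&lt;br /&gt;
     , count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex, follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;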
&lt;br /&gt;
&lt;br /&gt;
== * (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternately, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
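&lt;br /&gt;
To judge where an age limit might reasonably sit, it may help to tally the &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;-coded female arrivals by age.  The query below is a sketch, assuming fa_fol_date and b_birthdate are date columns so that &amp;lt;code&amp;gt;age()&amp;lt;/code&amp;gt; applies.  Like the test above, it uses the birthdate and not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch: number of U-coded female arrivals per completed year of age.&lt;br /&gt;
select extract(year from age(follow_arrival.fa_fol_date&lt;br /&gt;
                           , biography.b_birthdate)) as age_in_years&lt;br /&gt;
     , count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
  group by 1&lt;br /&gt;
  order by 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;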
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 239 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 127 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be adjusted in the data.  Alternatively, we can adjust the hard limit and set a soft limit of 14 years in the warning system, with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
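&lt;br /&gt;
A minimal sketch of such an update, assuming it is applied directly to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; and selects rows the same way as the query above.  The actual conversion code may differ:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: recode n/a as MISS for female arriving individuals.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;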
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
	and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
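&lt;br /&gt;
The two spelling corrections above could be applied in a single statement.  This is a sketch; that the correction is made with an update of &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; is an assumption:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the capitalization of the known variants.&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = case fa_data_source&lt;br /&gt;
                         when &amp;#039;BREC&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         when &amp;#039;brec&amp;#039; then &amp;#039;Brec&amp;#039;&lt;br /&gt;
                         when &amp;#039;TikI&amp;#039; then &amp;#039;Tiki&amp;#039;&lt;br /&gt;
                       end&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;, &amp;#039;TikI&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;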
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source  count&lt;br /&gt;
  Tiki_GM         202&lt;br /&gt;
  Tiki_PM         165&lt;br /&gt;
  Tiki_SS         22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
Checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt; finds none; the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and checking only the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean to be a duplicate?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
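&lt;br /&gt;
The check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt;, described in the problem statement, can be sketched by following the same pattern.  This is an assumed reconstruction, not necessarily the query that produced the count of 430:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only (assumed reconstruction)&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Because the comparison uses &amp;lt;code&amp;gt;=&amp;lt;/code&amp;gt;, rows with a NULL &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; are never reported by this sketch, even when they form a group.&lt;br /&gt;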
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both start and end times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 688 rows, having only 3 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fix in the MS Access data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 26 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=453</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=453"/>
		<updated>2025-10-22T22:33:11Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#30) Some follows have no community */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted the row.&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted the time stamp.&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted the time stamp.&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
No longer a problem as NULLs are now allowed.&lt;br /&gt;
&lt;br /&gt;
=== (Not) Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID_publication_info&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column contains non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
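&lt;br /&gt;
As a sketch only, and assuming that &amp;lt;code&amp;gt;BREC_time&amp;lt;/code&amp;gt; is loaded into the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema as a timestamp (the &amp;lt;code&amp;gt;::DATE&amp;lt;/code&amp;gt; cast above suggests this), the date portion could be discarded with a cast to &amp;lt;code&amp;gt;TIME&amp;lt;/code&amp;gt;.  The actual conversion code may do this differently:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Keep only the time of day; the date portion is dropped.&lt;br /&gt;
-- The brec_time alias is hypothetical.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;::TIME as brec_time&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;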
&lt;br /&gt;
== * (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has 2 potential dads.  Introduce a DadIDStatus column, to replace the DadIDPrelim column and have a code that describes what&amp;#039;s going on with &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Fixed in the MS Access data; &amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; was assigned the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; individual as the dad.  Problem will be marked resolved with the upload of the next MS Access database dump.&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 5d85b83bfe1a.&lt;br /&gt;
&lt;br /&gt;
== * (#13) COMM_MEMBS rows place individuals in a community, that is not their birth community, before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
13 COMM_MEMBS rows place individuals into a community, that is not their birth community, before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthgroup, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where b.b_birthgroup is distinct from cm.cm_cl_community_id&lt;br /&gt;
        and cm.cm_start_date &amp;lt; b.b_entrydate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
3 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate&lt;br /&gt;
  order by b.b_animid, cm.cm_end_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024.&lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022.&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
See problem #18.  Fixed in commit 7a48fb2c4aad3b5bd4&lt;br /&gt;
&lt;br /&gt;
== (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit 4c9626b304.&lt;br /&gt;
&lt;br /&gt;
== (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
Fixed in commit  7a48fb2c4aad3b5b.&lt;br /&gt;
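&lt;br /&gt;
As a minimal sketch of the substitution, assuming the conversion maps NULL to the &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; code with &amp;lt;code&amp;gt;COALESCE&amp;lt;/code&amp;gt; (the actual conversion code may differ, and the sample rows are hypothetical):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical rows standing in for the update log; not real data.&lt;br /&gt;
WITH log (chimp_id, made_by) AS (&lt;br /&gt;
  VALUES (&amp;#039;AAA&amp;#039;, &amp;#039;XYZ&amp;#039;)&lt;br /&gt;
       , (&amp;#039;BBB&amp;#039;, NULL))&lt;br /&gt;
-- Replace a NULL made_by with the unknown person&amp;#039;s code.&lt;br /&gt;
SELECT chimp_id, COALESCE(made_by, &amp;#039;UNK&amp;#039;) AS made_by&lt;br /&gt;
  FROM log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;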
&lt;br /&gt;
== (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18.)&lt;br /&gt;
&lt;br /&gt;
Fixed in commit 8a528cc7d1a9fb.&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Make this a &amp;quot;soft&amp;quot; error.  Fixed in commit c96555f9f326.&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;br /&gt;
&lt;br /&gt;
== * (#24) Follow starts are not first arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,397 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_begin&amp;quot; is not the first arrival time, fa_time_start,&lt;br /&gt;
on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow start is supposed&lt;br /&gt;
to be the first arrival.  Is there additional data&lt;br /&gt;
in the old fol_time_begin, like the actual time&lt;br /&gt;
the observers started working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.min_start = follow.fol_time_begin)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is a data entry error; ignore the problem and ignore the data in follow.fol_time_begin.&lt;br /&gt;
&lt;br /&gt;
== * (#25) Follow ends are not last arrivals ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 6,570 cases where the old FOLLOW table&amp;#039;s column&lt;br /&gt;
&amp;quot;fol_time_end&amp;quot; is not the last arrival&lt;br /&gt;
time, fa_time_end, on FOLLOW_ARRIVAL.&lt;br /&gt;
&lt;br /&gt;
The Gombe Chimpanzee Database Handbook says that this&lt;br /&gt;
should never happen, because the follow end is supposed&lt;br /&gt;
to be the last departure.  Is there additional data&lt;br /&gt;
in the old fol_time_end, like the actual time&lt;br /&gt;
the observers stopped working?  If not, which data is correct,&lt;br /&gt;
the one in FOLLOW or the one in FOLLOW_ARRIVAL?&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
          , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_end&lt;br /&gt;
  FROM spans&lt;br /&gt;
    JOIN clean.follow&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
        (SELECT 1&lt;br /&gt;
           FROM clean.follow&lt;br /&gt;
           WHERE follow.fol_date = spans.fa_fol_date&lt;br /&gt;
                 AND follow.fol_b_animid = spans.fa_fol_b_focal_animid&lt;br /&gt;
                 AND spans.max_end = follow.fol_time_end)&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
TENTATIVE: This is a data entry error; ignore the problem and ignore the data in follow.fol_time_end.&lt;br /&gt;
&lt;br /&gt;
== * (#26) Mismatch of start-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 1,848 follows where the follow arrival says the focal started in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MIN(fa_time_start) AS min_start&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*, follow.fol_time_begin, follow.fol_flag_begin_in_nest&lt;br /&gt;
     , first_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN clean.follow_arrival&lt;br /&gt;
      AS first_arrivals&lt;br /&gt;
      ON (first_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND first_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND first_arrivals.fa_time_start = spans.min_start)&lt;br /&gt;
  WHERE (((first_arrivals.fa_type_of_nesting = 1&lt;br /&gt;
           OR first_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_begin_in_nest = 0)&lt;br /&gt;
         OR (first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 1&lt;br /&gt;
             AND first_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_begin_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#27) Mismatch of end-in-nest on follow and follow_arrival ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3,804 follows where the follow arrival says the focal ended in the nest but the follow does not, or vice-versa.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH spans AS&lt;br /&gt;
  (SELECT fa_fol_date, fa_fol_b_focal_animid&lt;br /&gt;
        , MAX(fa_time_end) AS max_end&lt;br /&gt;
     FROM clean.follow_arrival&lt;br /&gt;
     WHERE fa_b_arr_animid = fa_fol_b_focal_animid&lt;br /&gt;
     GROUP BY fa_fol_date, fa_fol_b_focal_animid)&lt;br /&gt;
SELECT spans.*&lt;br /&gt;
     , follow.fol_time_end, follow.fol_flag_end_in_nest&lt;br /&gt;
     , last_arrivals.fa_type_of_nesting&lt;br /&gt;
  FROM clean.follow&lt;br /&gt;
    JOIN spans&lt;br /&gt;
      ON (follow.fol_date = spans.fa_fol_date&lt;br /&gt;
          AND follow.fol_b_animid = spans.fa_fol_b_focal_animid)&lt;br /&gt;
    JOIN clean.follow_arrival&lt;br /&gt;
      AS last_arrivals&lt;br /&gt;
      ON (last_arrivals.fa_fol_date = follow.fol_date&lt;br /&gt;
          AND last_arrivals.fa_fol_b_focal_animid = follow.fol_b_animid&lt;br /&gt;
          AND last_arrivals.fa_time_end = spans.max_end)&lt;br /&gt;
  WHERE (((last_arrivals.fa_type_of_nesting = 2&lt;br /&gt;
           OR last_arrivals.fa_type_of_nesting = 3)&lt;br /&gt;
          AND follow.fol_flag_end_in_nest = 0)&lt;br /&gt;
         OR (last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 2&lt;br /&gt;
             AND last_arrivals.fa_type_of_nesting &amp;lt;&amp;gt; 3&lt;br /&gt;
             AND follow.fol_flag_end_in_nest = 1))&lt;br /&gt;
  ORDER BY spans.fa_fol_date, spans.fa_fol_b_focal_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#28) The FOLLOW.FOL_distance_traveled column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.FOL_distance_traveled column has no corresponding column in the new&lt;br /&gt;
database design.  The conversion process does not check that the value&lt;br /&gt;
of this column is consistent with the other data in the database from which it is computed.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This is a computed column and does not need to be converted.  The desired value is computed in the design of the new database.&lt;br /&gt;
&lt;br /&gt;
The assumption is that the &amp;quot;raw&amp;quot; data from which this value is computed in the MS Access database is correct.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#29) The FOLLOW.Brecord_notes column is not converted ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The FOLLOW.Brecord_notes column has no corresponding column in the new&lt;br /&gt;
database design.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
This column is used for administrative purposes and does not need to be in the new database design.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#30) Some follows have no community ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 13 follows with no community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where fol_cl_community_id is null&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
ICG fixed in Access 10/2025&lt;br /&gt;
&lt;br /&gt;
== (#31) Some follows have an animid with trailing spaces ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 follows with animids that don&amp;#039;t exist, because they have trailing spaces.  They involve 9 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select rtrim(fol_b_animid)&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where rtrim(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Remove the trailing spaces in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== (#32) Some follows have an animid that is lower-case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is 1 follow with an animid that doesn&amp;#039;t exist, because it is lower-case.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
Cleaned up in &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema, so query &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.follow&lt;br /&gt;
  where upper(fol_b_animid) &amp;lt;&amp;gt; fol_b_animid&lt;br /&gt;
  order by fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Convert the animid to upper-case in the conversion process.&lt;br /&gt;
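&lt;br /&gt;
A minimal sketch of the cleanup, combined with the trailing-space removal of problem #31, assuming both are done with &amp;lt;code&amp;gt;upper(rtrim(...))&amp;lt;/code&amp;gt; as in the queries on this page (the sample animids are hypothetical, and the actual conversion code may differ):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical animids: one with trailing spaces, one lower-case, one clean.&lt;br /&gt;
WITH raw_follow (fol_b_animid) AS (&lt;br /&gt;
  VALUES (&amp;#039;AAA  &amp;#039;), (&amp;#039;aaa&amp;#039;), (&amp;#039;AAA&amp;#039;))&lt;br /&gt;
-- Strip trailing spaces, then force upper-case.&lt;br /&gt;
SELECT fol_b_animid AS original&lt;br /&gt;
     , upper(rtrim(fol_b_animid)) AS cleaned&lt;br /&gt;
  FROM raw_follow;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;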
&lt;br /&gt;
== * (#33) Some follows have an animid that does not exist, even after animid cleanup ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 14 follows with animids that don&amp;#039;t exist, even after cleanup that removes trailing spaces and forces upper-case.  (Some of these may be due to prior errors.)  They involve 3 distinct animids.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 14 follows with bad animids&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The 3 animids involved&lt;br /&gt;
select distinct follow.fol_b_animid&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where not exists&lt;br /&gt;
          (select 1&lt;br /&gt;
             from clean.biography&lt;br /&gt;
             where biography.b_animid = upper(rtrim(follow.fol_b_animid)))&lt;br /&gt;
  order by follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#34) There are duplicate animid, date combinations on FOLLOW ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 2 sets of duplicate animid, date combinations on the FOLLOW table, for a total of 4 rows.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate animid, date combinations&lt;br /&gt;
select follow.fol_b_animid, follow.fol_date, count(*)&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
  having count(*) &amp;gt; 1&lt;br /&gt;
  order by fol_b_animid, fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The duplicate rows&lt;br /&gt;
with dups as (&lt;br /&gt;
  select follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    from clean.follow&lt;br /&gt;
    group by follow.fol_b_animid, follow.fol_date&lt;br /&gt;
    having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
       join dups on (follow.fol_b_animid = dups.fol_b_animid&lt;br /&gt;
                     and follow.fol_date = dups.fol_date)&lt;br /&gt;
  order by follow.fol_b_animid, follow.fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== * (#35) There are follows done before a focal was under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
There are 5 follows that are done before their focal was under study, before the focal&amp;#039;s EntryDate.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow.*, biography.b_entrydate&lt;br /&gt;
  from clean.follow&lt;br /&gt;
    join clean.biography on (biography.b_animid = follow.fol_b_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow.fol_date&lt;br /&gt;
  order by follow.fol_date, follow.fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#36) FOLLOW_OBSERVERS.Period is not checked against follow start or stop times ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The conversion process uses the clean.follow columns of fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, and fol_pm_observer_2 to populate FOLLOW_OBSERVERS.&lt;br /&gt;
(The *_observer_1 column goes into FOLLOW_OBSERVERS.OBS_BRec and the *_observer_2 column goes into OBS_Tiki.)&lt;br /&gt;
&lt;br /&gt;
If both the *_observer_1 and the *_observer_2 columns are either NULL or the empty string (after space trimming), then no row is created for the respective time period.&lt;br /&gt;
&lt;br /&gt;
There are no checks done to ensure that the time periods of the follow have any relation to the FOLLOW_OBSERVERS.Period value.&lt;br /&gt;
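&lt;br /&gt;
The rule above can be sketched roughly as follows.  This is an illustration only, not the conversion code: the sample rows and the &amp;lt;code&amp;gt;AM&amp;lt;/code&amp;gt;/&amp;lt;code&amp;gt;PM&amp;lt;/code&amp;gt; period codes are assumptions.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical follows; the observer codes are made up.&lt;br /&gt;
WITH follow (fol_date, fol_b_animid&lt;br /&gt;
           , fol_am_observer_1, fol_am_observer_2&lt;br /&gt;
           , fol_pm_observer_1, fol_pm_observer_2) AS (&lt;br /&gt;
  VALUES (DATE &amp;#039;2000-01-01&amp;#039;, &amp;#039;AAA&amp;#039;, &amp;#039;XX&amp;#039;, &amp;#039;YY&amp;#039;, &amp;#039;  &amp;#039;, CAST(NULL AS text))&lt;br /&gt;
       , (DATE &amp;#039;2000-01-02&amp;#039;, &amp;#039;AAA&amp;#039;, NULL, NULL, &amp;#039;XX&amp;#039;, NULL))&lt;br /&gt;
-- One candidate row per follow and time period.&lt;br /&gt;
, periods AS (&lt;br /&gt;
  SELECT fol_date, fol_b_animid, &amp;#039;AM&amp;#039; AS period&lt;br /&gt;
       , btrim(fol_am_observer_1) AS obs_brec&lt;br /&gt;
       , btrim(fol_am_observer_2) AS obs_tiki&lt;br /&gt;
    FROM follow&lt;br /&gt;
  UNION ALL&lt;br /&gt;
  SELECT fol_date, fol_b_animid, &amp;#039;PM&amp;#039;&lt;br /&gt;
       , btrim(fol_pm_observer_1)&lt;br /&gt;
       , btrim(fol_pm_observer_2)&lt;br /&gt;
    FROM follow)&lt;br /&gt;
-- Keep a period only when at least one observer is recorded.&lt;br /&gt;
SELECT fol_date, fol_b_animid, period, obs_brec, obs_tiki&lt;br /&gt;
  FROM periods&lt;br /&gt;
  WHERE coalesce(obs_brec, &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        OR coalesce(obs_tiki, &amp;#039;&amp;#039;) &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
  ORDER BY fol_date, period;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
With this sample data the first follow keeps only its AM period and the second keeps only its PM period.&lt;br /&gt;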
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  (If someone cares, add a query to the warning system.)&lt;br /&gt;
&lt;br /&gt;
== (#37) Some follows have no recorded observers ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 follows with no recorded observers, but the system requires there be a FOLLOW_OBSERVERS record related to the follow.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow&lt;br /&gt;
  where coalesce(btrim(fol_am_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_am_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_1), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
        and  coalesce(btrim(fol_pm_observer_2), &amp;#039;&amp;#039;) = &amp;#039;&amp;#039;&lt;br /&gt;
  order by fol_date, fol_b_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make the &amp;quot;NONE&amp;quot; (no observer) person the observers.&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;UNK&amp;quot; (unknown) time period in PERIODS, and make that the time period.&lt;br /&gt;
&lt;br /&gt;
== (#38) Some follows have only one observer, but the system wants two ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are follows with only one observer, but the system requires a FOLLOW_OBSERVERS record have a value in both OBS_BRec and OBS_Tiki.&lt;br /&gt;
&lt;br /&gt;
See problem #36 for a description of the follow observer conversion process.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create a &amp;quot;NONE&amp;quot; person, and make that person the observer when there&amp;#039;s otherwise not a value.&lt;br /&gt;
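&lt;br /&gt;
A minimal sketch of the substitution, assuming blank or NULL observers are mapped to &amp;lt;code&amp;gt;NONE&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;COALESCE&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;NULLIF&amp;lt;/code&amp;gt; (the sample rows are hypothetical, and the actual conversion code may differ):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical observer pairs, each missing one observer.&lt;br /&gt;
WITH obs (obs_brec, obs_tiki) AS (&lt;br /&gt;
  VALUES (&amp;#039;XX&amp;#039;, CAST(NULL AS text))&lt;br /&gt;
       , (&amp;#039;  &amp;#039;, &amp;#039;YY&amp;#039;))&lt;br /&gt;
-- Treat blank strings as NULL, then fall back to the NONE person.&lt;br /&gt;
SELECT coalesce(nullif(btrim(obs_brec), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) AS obs_brec&lt;br /&gt;
     , coalesce(nullif(btrim(obs_tiki), &amp;#039;&amp;#039;), &amp;#039;NONE&amp;#039;) AS obs_tiki&lt;br /&gt;
  FROM obs;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;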
&lt;br /&gt;
== (#39) In follow, there are observers that are 2 people ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 30 rows (28, really, because of n/a) that appear to represent 2 people.&lt;br /&gt;
&lt;br /&gt;
These look like names, separated by the &amp;quot;/&amp;quot; character.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM clean.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM clean.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
          AND STRPOS(uniq_people.person, &amp;#039;/&amp;#039;) &amp;lt;&amp;gt; 0&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Ignore the problem.  Mark all people containing a &amp;quot;/&amp;quot; character as not active, so they can&amp;#039;t be used in the future.&lt;br /&gt;
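&lt;br /&gt;
As a rough sketch of how the inactive flag could be derived (the &amp;lt;code&amp;gt;active&amp;lt;/code&amp;gt; column name and the sample codes are assumptions, not taken from the database design):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical person codes; the second represents two people.&lt;br /&gt;
WITH people (person) AS (&lt;br /&gt;
  VALUES (&amp;#039;XX&amp;#039;), (&amp;#039;XX/YY&amp;#039;))&lt;br /&gt;
-- A person is active only when the code contains no &amp;quot;/&amp;quot;.&lt;br /&gt;
SELECT person, (strpos(person, &amp;#039;/&amp;#039;) = 0) AS active&lt;br /&gt;
  FROM people;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;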
&lt;br /&gt;
== (#40) In follow, there are observers that differ only by character case ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow table, in columns fol_am_observer_1, fol_am_observer_2, fol_pm_observer_1, fol_pm_observer_2, there are 511 names that differ only by character case.  So, about half that in terms of unique names.&lt;br /&gt;
&lt;br /&gt;
Accounting for these names and excluding the duplicates complicates the conversion process, so resolving this problem is holding back additional conversion work.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note the use of the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, instead of the&lt;br /&gt;
&amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.  This is because the observer data&lt;br /&gt;
in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema has been case-normalized.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
WITH new_people AS (&lt;br /&gt;
  SELECT BTRIM(follow.fol_am_observer_1) AS person&lt;br /&gt;
    FROM easy.follow&lt;br /&gt;
    WHERE follow.fol_am_observer_1 IS NOT NULL&lt;br /&gt;
    GROUP BY BTRIM(follow.fol_am_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_am_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_am_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_am_observer_2)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_1) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_1 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_1)&lt;br /&gt;
  UNION&lt;br /&gt;
    SELECT BTRIM(follow.fol_pm_observer_2) AS person&lt;br /&gt;
      FROM easy.follow&lt;br /&gt;
      WHERE follow.fol_pm_observer_2 IS NOT NULL&lt;br /&gt;
      GROUP BY BTRIM(follow.fol_pm_observer_2)&lt;br /&gt;
)&lt;br /&gt;
, uniq_people AS (&lt;br /&gt;
  SELECT new_people.person&lt;br /&gt;
    FROM new_people&lt;br /&gt;
    WHERE new_people.person &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
    GROUP BY new_people.person&lt;br /&gt;
  )&lt;br /&gt;
  SELECT uniq_people.person&lt;br /&gt;
    FROM uniq_people&lt;br /&gt;
         JOIN uniq_people AS up&lt;br /&gt;
              ON (LOWER(uniq_people.person) = LOWER(up.person))&lt;br /&gt;
    WHERE uniq_people.person &amp;lt;&amp;gt; up.person&lt;br /&gt;
    GROUP BY uniq_people.person&lt;br /&gt;
    ORDER BY LOWER(uniq_people.person), uniq_people.person;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Use the mixed case code when such exists.  This involves changing the observer columns in the &amp;lt;code&amp;gt;follow&amp;lt;/code&amp;gt; table.  See [[The_Old_Database#The_observer_columns_of_the_follow_table|the notes]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#41) There are follow arrivals with NULL nesting information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 11 rows with a NULL fa_type_of_nesting.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_nesting IS NULL;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#42) There are follow arrivals with NULL sexual cycle information ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 222 rows with a NULL fa_type_of_cycle.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.follow_arrival where fa_type_of_cycle is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Make a &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; &amp;lt;code&amp;gt;CYCLE_STATES&amp;lt;/code&amp;gt; value and use that for the arrivals with no cycle information.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#43) There are follow arrivals with no related follow ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 732 rows with a focal and a date that have no matching information on the follow table.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT *&lt;br /&gt;
  FROM clean.follow_arrival&lt;br /&gt;
  WHERE NOT EXISTS&lt;br /&gt;
    (SELECT 1&lt;br /&gt;
       FROM clean.follow&lt;br /&gt;
       WHERE follow.fol_b_animid = follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
             AND follow.fol_date = follow_arrival.fa_fol_date)&lt;br /&gt;
  ORDER BY follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#44) There are follow arrivals where the arriving chimp arrives before being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 217 rows where the fa_b_arr_animid, the arriving individual, has an entry date after the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_entrydate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_entrydate &amp;gt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#45) The follow_arrival.fa_update column is not preserved ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is nowhere in the current design to store the values in the follow_arrival.fa_update column.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#46) There are follow_arrivals where non-females have a cycle code that is other than &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 169 follow_arrival rows, for non-female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) that is not &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To summarize by sex and cycle code:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select biography.b_sex, follow_arrival.fa_type_of_cycle, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex &amp;lt;&amp;gt; &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;n/a&amp;#039;&lt;br /&gt;
  group by biography.b_sex, follow_arrival.fa_type_of_cycle&lt;br /&gt;
  order by biography.b_sex&lt;br /&gt;
         , follow_arrival.fa_type_of_cycle;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Allow males to have a cycle code of &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change non-females who have a cycle code of &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt; to a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#47) There are follow_arrivals where females that are too young have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 16 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are less than 5 years of age.&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;5 years&amp;#039;::INTERVAL&lt;br /&gt;
                )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note: The original query used the wrong inequality symbol.&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changing the limit from 6 years of age to 5 reduced the number of outstanding problems to 16.&lt;br /&gt;
&lt;br /&gt;
The rest will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 5 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== * (#48) There are follow arrivals where the arriving chimp arrives after finishing being under study ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
In the follow_arrival table, there are 239 rows where the fa_b_arr_animid, the arriving individual, has a departure date before the date of the follow.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_departdate, biography.b_sex&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
         on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_departdate &amp;lt; follow_arrival.fa_fol_date&lt;br /&gt;
  order by biography.b_animid, follow_arrival.fa_fol_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#49) There are follow_arrivals where females that are too old have a cycle code of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 127 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt; but are more than 14 years of age.&lt;br /&gt;
(Actually, because the endpoint takes up the whole 14th year, this means at least 15 years of age.)&lt;br /&gt;
&lt;br /&gt;
Naturally, this number will change if the age limit is changed, but this is here as a placeholder.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the maximum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Note: Original query used the wrong inequality.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;U&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;lt;= (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;1 year&amp;#039;::INTERVAL&lt;br /&gt;
                 - &amp;#039;14 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
The number of problem rows was reduced after changing the limit to 14 years from 9 years.&lt;br /&gt;
&lt;br /&gt;
The remaining problems will have to be fixed in the data.  Alternatively, we can adjust the hard limit, and set a soft limit of 14 years in the warning system with a note to change the hard limit back once the errors are resolved.&lt;br /&gt;
&lt;br /&gt;
== (#50) There are follow_arrivals where females have a cycle code of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4,496 follow_arrival rows, for female arriving individuals, that have a sexual cycle code (fa_type_of_cycle) of &amp;lt;code&amp;gt;n/a&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
     , biography.b_sex&lt;br /&gt;
     , biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change the cycle code to &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt;, for these rows.  This is done in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&lt;br /&gt;
This is an indication that future cleanup is required.&lt;br /&gt;
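&lt;br /&gt;
A minimal sketch of the change, assuming it is applied as an update to &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt; that is joined to &amp;lt;code&amp;gt;clean.biography&amp;lt;/code&amp;gt; in the same way as the query above.  The conversion may make the change by other means.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: assumes the change is made directly in the clean schema&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_type_of_cycle = &amp;#039;MISS&amp;#039;&lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where biography.b_animid = follow_arrival.fa_b_arr_animid&lt;br /&gt;
        and biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle = &amp;#039;n/a&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;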
&lt;br /&gt;
== (#51) There are follow_arrivals with invalid fa_data_source values  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 546 follow_arrival rows that have fa_data_source values that are not one of:&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_Mom&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Tiki_ID&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
Query the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema, because the data has been cleaned in the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&lt;br /&gt;
-- Summarize with:&lt;br /&gt;
select follow_arrival.fa_data_source, count(*)&lt;br /&gt;
  from easy.follow_arrival&lt;br /&gt;
  where follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_Mom&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Tiki_ID&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_data_source &amp;lt;&amp;gt; &amp;#039;Brec&amp;#039;&lt;br /&gt;
  group by follow_arrival.fa_data_source&lt;br /&gt;
  order by follow_arrival.fa_data_source;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;BREC&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;brec&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Brec&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;code&amp;gt;TikI&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Tiki&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Add the other codes:&lt;br /&gt;
&lt;br /&gt;
  fa_data_source	count&lt;br /&gt;
  Tiki_GM	202&lt;br /&gt;
  Tiki_PM	165&lt;br /&gt;
  Tiki_SS	22&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Also: Tiki_Mom&lt;br /&gt;
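&lt;br /&gt;
The two spelling corrections above could be applied with statements like the following.  This is a sketch that assumes the corrections are made directly in &amp;lt;code&amp;gt;clean.follow_arrival&amp;lt;/code&amp;gt;; the conversion may apply them elsewhere.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: normalize the capitalization of the data source codes&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Brec&amp;#039;&lt;br /&gt;
  where fa_data_source in (&amp;#039;BREC&amp;#039;, &amp;#039;brec&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
update clean.follow_arrival&lt;br /&gt;
  set fa_data_source = &amp;#039;Tiki&amp;#039;&lt;br /&gt;
  where fa_data_source = &amp;#039;TikI&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;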
&lt;br /&gt;
== * (#52) There are follow_arrivals that are almost duplicates  ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
This entry may end up being multiple problems.&lt;br /&gt;
&lt;br /&gt;
There are follow_arrivals that are near duplicates.&lt;br /&gt;
When checking for duplicates on &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_seq_num&amp;lt;/code&amp;gt;, the rows are always unique.&lt;br /&gt;
But checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;,&lt;br /&gt;
&amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt;,&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; yields 184 rows.&lt;br /&gt;
Leaving off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; and just checking the combination of &amp;lt;code&amp;gt;fa_fol_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_fol_b_focal_animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt;&lt;br /&gt;
and &amp;lt;code&amp;gt;fa_time_start&amp;lt;/code&amp;gt; yields 430 rows.&lt;br /&gt;
&lt;br /&gt;
Why the duplicates?&lt;br /&gt;
&lt;br /&gt;
This entry is a call for a definition of an &amp;lt;code&amp;gt;ARRIVALS&amp;lt;/code&amp;gt; row: what does it mean for two rows to be duplicates?&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- no duplicates when looking at just sequence number&lt;br /&gt;
select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_seq_num&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_seq_num&lt;br /&gt;
     having count(*) &amp;gt; 1;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start and end time&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
        , fa_time_end&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
            , fa_time_end&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1                                             &lt;br /&gt;
       from dups                                     &lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start&lt;br /&gt;
             and follow_arrival.fa_time_end = dups.fa_time_end)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_time_end&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
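&lt;br /&gt;
The check that leaves off &amp;lt;code&amp;gt;fa_time_end&amp;lt;/code&amp;gt; can be run with a query along the following lines.  This is a sketch patterned on the query above, not necessarily the exact query that produced the count given in the Problem section.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Checking against start time only&lt;br /&gt;
with dups as&lt;br /&gt;
  (select fa_fol_date&lt;br /&gt;
        , fa_fol_b_focal_animid&lt;br /&gt;
        , fa_b_arr_animid&lt;br /&gt;
        , fa_time_start&lt;br /&gt;
     from clean.follow_arrival&lt;br /&gt;
     group by fa_fol_date&lt;br /&gt;
            , fa_fol_b_focal_animid&lt;br /&gt;
            , fa_b_arr_animid&lt;br /&gt;
            , fa_time_start&lt;br /&gt;
     having count(*) &amp;gt; 1)&lt;br /&gt;
select *&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from dups&lt;br /&gt;
       where follow_arrival.fa_fol_date = dups.fa_fol_date&lt;br /&gt;
             and follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
                 = dups.fa_fol_b_focal_animid&lt;br /&gt;
             and follow_arrival.fa_b_arr_animid = dups.fa_b_arr_animid&lt;br /&gt;
             and follow_arrival.fa_time_start = dups.fa_time_start)&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_time_start&lt;br /&gt;
         , follow_arrival.fa_seq_num;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;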
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Where both the start and end times are duplicated, along with other data like cycle state, I can add additional observers or data sources.  But I don&amp;#039;t know what to do with &amp;quot;almost the same&amp;quot; data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#53) There are follow_arrivals where the arriving chimp does not exist ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 688 rows, having only 3 different &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; values, where the &amp;lt;code&amp;gt;fa_b_arr_animid&amp;lt;/code&amp;gt; value is not a &amp;lt;code&amp;gt;biography.b_animid&amp;lt;/code&amp;gt; value.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- The rows&lt;br /&gt;
select *                                       &lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid&lt;br /&gt;
         , follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid;&lt;br /&gt;
&lt;br /&gt;
-- A summary of the bad animal ids&lt;br /&gt;
select follow_arrival.fa_b_arr_animid, count(*)&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
  where not exists&lt;br /&gt;
    (select 1&lt;br /&gt;
       from clean.biography&lt;br /&gt;
       where biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  group by follow_arrival.fa_b_arr_animid&lt;br /&gt;
  order by follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fix in the MS Access data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#54) There are follow_arrivals where females that are too young have a cycle state code that is not &amp;lt;code&amp;gt;0&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;U&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;MISS&amp;lt;/code&amp;gt; ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 26 follow_arrival rows, for female arriving individuals, that have a sexual cycle code indicating sexual swelling but are less than 8 years of age.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select follow_arrival.*, biography.b_sex, biography.b_birthdate&lt;br /&gt;
  from clean.follow_arrival&lt;br /&gt;
    join clean.biography&lt;br /&gt;
           on (biography.b_animid = follow_arrival.fa_b_arr_animid)&lt;br /&gt;
  where biography.b_sex = &amp;#039;F&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;0&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;U&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_type_of_cycle &amp;lt;&amp;gt; &amp;#039;MISS&amp;#039;&lt;br /&gt;
        and biography.b_birthdate&lt;br /&gt;
              &amp;gt; (follow_arrival.fa_fol_date&lt;br /&gt;
                 - &amp;#039;8 years&amp;#039;::INTERVAL&lt;br /&gt;
                 )&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF2&amp;#039;&lt;br /&gt;
        and follow_arrival.fa_b_arr_animid &amp;lt;&amp;gt; &amp;#039;MGF3&amp;#039;&lt;br /&gt;
  order by follow_arrival.fa_fol_date&lt;br /&gt;
         , follow_arrival.fa_fol_b_focal_animid&lt;br /&gt;
         , follow_arrival.fa_b_arr_animid;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#55) There are community_membership rows that place an individual in a community before birth ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There is one individual who is placed in a community, once, before birth.&lt;br /&gt;
&lt;br /&gt;
Note: The test is against the birthdate, not the minimum possible birthdate.&lt;br /&gt;
&lt;br /&gt;
=== Bad data ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_birthdate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_birthdate&lt;br /&gt;
  order by b.b_animid, cm.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=338</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=338"/>
		<updated>2024-02-15T19:21:52Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#13) 234 COMM_MEMBS rows place individuals in a community before their BIOGRAPHY.StartDate */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS NOT NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== * (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
== * (#13) 234 COMM_MEMBS rows place individuals in a community before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
234 COMM_MEMBS rows place individuals into a community before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_entrydate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) 6 COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
6 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_departdate, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG, 2/1/2024.&lt;br /&gt;
Changed the end date of the KK_P1 membership to 8/14/2022.&lt;br /&gt;
&lt;br /&gt;
== * (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Have the conversion program change all the NULL values to &amp;lt;nowiki&amp;gt;&amp;#039;&amp;#039;&amp;lt;/nowiki&amp;gt;, the empty string.&lt;br /&gt;
&lt;br /&gt;
== * (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Need to talk to Karl.&lt;br /&gt;
&lt;br /&gt;
== * (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18.)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== * (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Talk to Karl&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single person code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=337</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=337"/>
		<updated>2024-02-15T19:14:13Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists the data problems encountered during the data conversion process and how each was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column contains non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== * (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Added KL and KL_KK to CommIds (KL = Kalande, KL_KK = Kasekela/Kalande).&lt;br /&gt;
The codes are created during conversion.&lt;br /&gt;
&lt;br /&gt;
== * (#13) 234 COMM_MEMBS rows place individuals in a community before their BIOGRAPHY.EntryDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
234 COMM_MEMBS rows place individuals into a community before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_entrydate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) 6 COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
6 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_departdate, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG, 2/1/2024.&lt;br /&gt;
Changed the end date of the KK_P1 membership to 8/14/2022.&lt;br /&gt;
&lt;br /&gt;
== * (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Have the conversion program change all the NULL values to &amp;lt;nowiki&amp;gt;&amp;#039;&amp;#039;&amp;lt;/nowiki&amp;gt;, the empty string.&lt;br /&gt;
&lt;br /&gt;
== * (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
need to talk to Karl&lt;br /&gt;
&lt;br /&gt;
== * (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
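&lt;br /&gt;
A minimal sketch of this substitution, assuming an in-memory SQLite database in place of the real one.&lt;br /&gt;
The table layouts, the "active" flag, and the sample rows are hypothetical; the actual conversion program may do this differently.&lt;br /&gt;

```python
import sqlite3

# Hypothetical stand-ins for the PEOPLE and COMM_MEMB_LOG tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (person TEXT PRIMARY KEY, active INTEGER NOT NULL);
    CREATE TABLE update_log (chimp_id TEXT NOT NULL, made_by TEXT);
    INSERT INTO people VALUES ('ICG', 1);
    INSERT INTO update_log VALUES ('TT', 'ICG'), ('KAZ', NULL);
""")

# Create the unknown person, marked inactive so new data cannot use it.
conn.execute("INSERT INTO people (person, active) VALUES ('UNK', 0)")

# Replace every NULL made_by value with the unknown person.
conn.execute("UPDATE update_log SET made_by = COALESCE(made_by, 'UNK')")

converted = conn.execute(
    "SELECT chimp_id, made_by FROM update_log ORDER BY chimp_id"
).fetchall()
remaining_nulls = conn.execute(
    "SELECT count(*) FROM update_log WHERE made_by IS NULL"
).fetchone()[0]
```

After the update, no row has a NULL made_by, so the column can be declared NOT NULL and can reference the people table.&lt;br /&gt;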
&lt;br /&gt;
&lt;br /&gt;
== * (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== * (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Talk to Karl&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=334</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=334"/>
		<updated>2024-02-08T20:59:43Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
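&lt;br /&gt;
The change itself was made by hand in MS Access. As an illustration only, the transformation can be sketched as below.&lt;br /&gt;
The function name and the sample value 'CH042' are hypothetical and are not taken from the data.&lt;br /&gt;

```python
def animid_num_to_int(value):
    """Convert a textual B_AnimID_num such as 'CH042' to an integer.

    None and the empty string both map to None (SQL NULL), matching
    the fix for problem #6.
    """
    if value is None or value == "":
        return None
    if not value.startswith("CH"):
        # Fail loudly instead of guessing at an unexpected format.
        raise ValueError(f"unexpected B_AnimID_num value: {value!r}")
    return int(value[2:])
```

Raising on values without the prefix keeps any unexpected data visible instead of silently converting it.&lt;br /&gt;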
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text values in this column.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== * (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;easy&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Add KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
&lt;br /&gt;
== * (#13) 234 COMM_MEMBS rows place individuals in a community before their BIOGRAPHY.StartDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
234 COMM_MEMBS rows place individuals into a community before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_entrydate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) 6 COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
6 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_departdate, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== * (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Have the conversion program change all the NULL values to &amp;lt;nowiki&amp;gt;&amp;#039;&amp;#039;&amp;lt;/nowiki&amp;gt;, the empty string.&lt;br /&gt;
&lt;br /&gt;
== * (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
need to talk to Karl&lt;br /&gt;
&lt;br /&gt;
== * (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== * (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Talk to Karl&lt;br /&gt;
&lt;br /&gt;
== (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access by ICG 2/8/2024. Changed multiple IDs to the one who made the final change.&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=333</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=333"/>
		<updated>2024-02-08T20:55:09Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* Solution */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
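&lt;br /&gt;
Before a change like this, a check along the following lines could confirm that every value really is &amp;quot;CH&amp;quot; followed by digits. This is only a sketch: it assumes the PostgreSQL &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema copy of the column is still TEXT, and the regular expression is an assumption about the expected format.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: list values that are not &amp;quot;CH&amp;quot; followed by digits.&lt;br /&gt;
-- NULL values are not returned; the empty string is.&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; !~ &amp;#039;^CH[0-9]+$&amp;#039;&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;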
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text in this column, with the empty string in place of &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
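&lt;br /&gt;
One way a conversion step could discard the stray date portion is to cast the value to a time. This is a sketch of the idea only, and may not be what the conversion program actually does. The &amp;lt;code&amp;gt;brec_time&amp;lt;/code&amp;gt; output name is an assumption.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: keep the time of day, drop the date portion.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;::TIME as brec_time&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;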
&lt;br /&gt;
== * (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; the communities they name do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Add KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
&lt;br /&gt;
== * (#13) 234 COMM_MEMBS rows place individuals in a community before their BIOGRAPHY.StartDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
234 COMM_MEMBS rows place individuals into a community before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_entrydate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) 6 COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
6 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_departdate, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== * (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Have the conversion program change all the NULL values to &amp;lt;nowiki&amp;gt;&amp;#039;&amp;#039;&amp;lt;/nowiki&amp;gt;, the empty string.&lt;br /&gt;
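&lt;br /&gt;
A minimal sketch of that substitution, written as a query against the table and columns used above. How the conversion program actually applies it is not shown here, so treat this as an illustration only.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: report made_by with NULL replaced by the empty string.&lt;br /&gt;
select date_of_update&lt;br /&gt;
     , chimp_id&lt;br /&gt;
     , coalesce(made_by, &amp;#039;&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;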
&lt;br /&gt;
== * (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Need to talk to Karl.&lt;br /&gt;
&lt;br /&gt;
== * (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
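&lt;br /&gt;
The approach could be sketched as follows. This assumes that &amp;lt;code&amp;gt;clean.people&amp;lt;/code&amp;gt; can accept a row with only the &amp;lt;code&amp;gt;person&amp;lt;/code&amp;gt; column supplied. Any other required columns, and the mechanism for marking a person inactive, are not shown.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only.  Assumes no other clean.people columns are required.&lt;br /&gt;
insert into clean.people (person) values (&amp;#039;UNK&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
-- Point every NULL made_by at the unknown person.&lt;br /&gt;
update clean.community_membership_update_log&lt;br /&gt;
  set made_by = &amp;#039;UNK&amp;#039;&lt;br /&gt;
  where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;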
&lt;br /&gt;
&lt;br /&gt;
== * (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== * (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Talk to Karl&lt;br /&gt;
&lt;br /&gt;
== * (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=332</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=332"/>
		<updated>2024-02-08T20:54:48Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#21) There are 6 rows in BIOGRAPHY_LOG where the Description is NULL */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text values, not &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
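&lt;br /&gt;
One way such values could be normalized, shown only as a sketch: casting the column to &amp;lt;code&amp;gt;TIME&amp;lt;/code&amp;gt; keeps the time of day and drops the date portion.  This assumes &amp;lt;code&amp;gt;BREC_time&amp;lt;/code&amp;gt; is stored as a timestamp in the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema; the output column name is hypothetical, and the conversion program may do this differently.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: discard the date portion, whatever it is.&lt;br /&gt;
select &amp;quot;BREC_time&amp;quot;::TIME as brec_time&lt;br /&gt;
  from raw.&amp;quot;BRECORD_NOTES&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;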
&lt;br /&gt;
== * (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Add KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
&lt;br /&gt;
== * (#13) 234 COMM_MEMBS rows place individuals in a community before their BIOGRAPHY.StartDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
234 COMM_MEMBS rows place individuals into a community before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_entrydate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) 6 COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
6 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_departdate, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== * (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Have the conversion program change all the NULL values to &amp;lt;nowiki&amp;gt;&amp;#039;&amp;#039;&amp;lt;/nowiki&amp;gt;, the empty string.&lt;br /&gt;
&lt;br /&gt;
== * (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Need to talk to Karl.&lt;br /&gt;
&lt;br /&gt;
== * (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
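&lt;br /&gt;
A minimal sketch of the substitution, assuming the table and column names from the query above and that a person with the code &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; has already been created.  Whether the conversion program does this with an update, or with &amp;lt;code&amp;gt;COALESCE&amp;lt;/code&amp;gt; while copying rows, is not stated here; the update below is only an illustration.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: replace NULL MadeBy values with the unknown person.&lt;br /&gt;
update clean.community_membership_update_log&lt;br /&gt;
  set made_by = &amp;#039;UNK&amp;#039;&lt;br /&gt;
  where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;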
&lt;br /&gt;
&lt;br /&gt;
== * (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== (#21) There are 6 rows in BIOGRAPHY_LOG where the update_description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024, using information in update_rationale&lt;br /&gt;
&lt;br /&gt;
== * (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single person code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=331</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=331"/>
		<updated>2024-02-08T20:50:34Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text values, not &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== * (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS values; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Add KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
&lt;br /&gt;
== * (#13) 234 COMM_MEMBS rows place individuals in a community before their BIOGRAPHY.StartDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
234 COMM_MEMBS rows place individuals into a community before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_entrydate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) 6 COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
6 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_departdate, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== * (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Have the conversion program change all the NULL values to &amp;lt;nowiki&amp;gt;&amp;#039;&amp;#039;&amp;lt;/nowiki&amp;gt;, the empty string.&lt;br /&gt;
&lt;br /&gt;
== * (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
need to talk to Karl&lt;br /&gt;
&lt;br /&gt;
== * (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
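&lt;br /&gt;
A minimal sketch of the substitution, assuming the code &amp;lt;code&amp;gt;UNK&amp;lt;/code&amp;gt; is the one chosen for the unknown person; the actual conversion program may do this differently:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Sketch only: list the log rows with NULL made_by values shown as UNK.&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
     , coalesce(made_by, &amp;#039;UNK&amp;#039;) as made_by&lt;br /&gt;
  from clean.community_membership_update_log;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;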
&lt;br /&gt;
&lt;br /&gt;
== * (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG fixed in MS Access 2/8/2024. Updated rationale to &amp;#039;routine update&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== * (#21) There are 6 rows in BIOGRAPHY_LOG where the Description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=330</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=330"/>
		<updated>2024-02-08T20:47:03Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* Solution */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
-------------+--------------------+----------&lt;br /&gt;
 TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text in this column, with no &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
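&lt;br /&gt;
A rough sketch of how such a table and view might look. The column names and types here are assumptions made for illustration, not the actual SokweDB definitions:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical, simplified table: only the columns needed for the example.&lt;br /&gt;
create table biography_data (&lt;br /&gt;
    animid      text primary key,&lt;br /&gt;
    dad_id      text,   -- confirmed father, if any&lt;br /&gt;
    dadidprelim text    -- preliminary father, if any&lt;br /&gt;
);&lt;br /&gt;
&lt;br /&gt;
-- The view exposes a single dadid column.  A preliminary father is&lt;br /&gt;
-- shown with the _prelim suffix; with neither value, dadid is NULL.&lt;br /&gt;
create view biography as&lt;br /&gt;
  select animid&lt;br /&gt;
       , case&lt;br /&gt;
           when dad_id is not null then dad_id&lt;br /&gt;
           when dadidprelim is not null then dadidprelim || &amp;#039;_prelim&amp;#039;&lt;br /&gt;
         end as dadid&lt;br /&gt;
    from biography_data;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;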
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== * (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Add KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
&lt;br /&gt;
== * (#13) 234 COMM_MEMBS rows place individuals in a community before their BIOGRAPHY.StartDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
234 COMM_MEMBS rows place individuals into a community before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_entrydate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) 6 COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
6 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_departdate, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== * (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Have the conversion program change all the NULL values to &amp;lt;nowiki&amp;gt;&amp;#039;&amp;#039;&amp;lt;/nowiki&amp;gt;, the empty string.&lt;br /&gt;
&lt;br /&gt;
== * (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
need to talk to Karl&lt;br /&gt;
&lt;br /&gt;
== * (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#21) There are 6 rows in BIOGRAPHY_LOG where the Description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=329</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=329"/>
		<updated>2024-02-08T20:41:19Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* Solution */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text, with no NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== * (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS; their communities do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Add KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
&lt;br /&gt;
== * (#13) 234 COMM_MEMBS rows place individuals in a community before their BIOGRAPHY.StartDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
234 COMM_MEMBS rows place individuals into a community before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_entrydate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) 6 COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
6 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_departdate, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
ICG EVL fixed BH, HAI, TZB2 in Access. Need to talk to Karl about MG, RO, WD. Treat like KL chimps in Biography?&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== * (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Have the conversion program change all the NULL values to &amp;lt;nowiki&amp;gt;&amp;#039;&amp;#039;&amp;lt;/nowiki&amp;gt;, the empty string.&lt;br /&gt;
&lt;br /&gt;
== * (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#21) There are 6 rows in BIOGRAPHY_LOG where the Description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=325</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=325"/>
		<updated>2024-02-01T20:02:50Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#15) TT is placed in a community twice on the same day */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text values in this column, never &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
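&lt;br /&gt;
A rough sketch of one possible reading of this solution, assuming the view shows the confirmed father when there is one and otherwise the preliminary father with the suffix appended. The table, view, and column names below are hypothetical stand-ins, and the sample rows are invented for illustration:&lt;br /&gt;
&lt;br /&gt;
```sql
-- Hypothetical stand-in for the BIOGRAPHY_DATA table; names are assumptions.
CREATE TEMPORARY TABLE biography_data_demo (
  animid      TEXT PRIMARY KEY,
  dadid       TEXT,   -- confirmed father, when known
  dadidprelim TEXT    -- preliminary father, when only that is known
);
INSERT INTO biography_data_demo VALUES
  ('AAA', 'BBB', NULL),   -- invented rows, not real data
  ('CCC', NULL, 'DDD'),
  ('EEE', NULL, NULL);

-- The view exposes a single dadid column. A preliminary father gets the
-- '_prelim' suffix; a row with neither value stays NULL because
-- NULL || '_prelim' is NULL.
CREATE TEMPORARY VIEW biography_demo AS
  SELECT animid
       , COALESCE(dadid, dadidprelim || '_prelim') AS dadid
    FROM biography_data_demo;

SELECT * FROM biography_demo ORDER BY animid;
```
&lt;br /&gt;
Whether the real view prefers the confirmed value over the preliminary one in this way is an assumption of the sketch.&lt;br /&gt;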
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
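&lt;br /&gt;
A minimal sketch of what such a fix could look like, assuming the conversion keeps only the time-of-day portion of &amp;lt;code&amp;gt;BREC_time&amp;lt;/code&amp;gt;. The temporary table and its rows are hypothetical stand-ins, not the real data or the actual conversion code:&lt;br /&gt;
&lt;br /&gt;
```sql
-- Hypothetical stand-in for raw."BRECORD_NOTES"; only the column of interest.
CREATE TEMPORARY TABLE brecord_notes_demo ("BREC_time" TIMESTAMP);
INSERT INTO brecord_notes_demo VALUES
  ('1899-12-30 07:15:00'),  -- the usual date portion of a time-only value
  ('2005-03-14 07:15:00');  -- invented row with a stray date portion

-- Casting to TIME drops the date portion, whatever it was.
SELECT "BREC_time"::TIME AS brec_time
  FROM brecord_notes_demo;
```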
&lt;br /&gt;
== * (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; the communities they name do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Add KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande&lt;br /&gt;
&lt;br /&gt;
== * (#13) 234 COMM_MEMBS rows place individuals in a community before their BIOGRAPHY.StartDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
234 COMM_MEMBS rows place individuals into a community before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_entrydate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) 6 COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
6 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_departdate, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 &lt;br /&gt;
Changed end date of KK_P1 membership to 8/14/2022&lt;br /&gt;
&lt;br /&gt;
ICG&lt;br /&gt;
&lt;br /&gt;
== * (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Have the conversion program change all the NULL values to &amp;lt;nowiki&amp;gt;&amp;#039;&amp;#039;&amp;lt;/nowiki&amp;gt;, the empty string.&lt;br /&gt;
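&lt;br /&gt;
A minimal sketch of that replacement, assuming it is done as a plain SQL update. The temporary table and its rows are hypothetical stand-ins for the update log, not the real conversion program:&lt;br /&gt;
&lt;br /&gt;
```sql
-- Hypothetical stand-in for the update log; only the columns of interest.
CREATE TEMPORARY TABLE log_empty_demo (chimp_id TEXT, made_by TEXT);
INSERT INTO log_empty_demo VALUES
  ('AAA', 'XYZ'),   -- invented rows, not real data
  ('BBB', NULL);

-- Replace every NULL made_by with the empty string.
UPDATE log_empty_demo
   SET made_by = ''
 WHERE made_by IS NULL;

-- After the update the column can be declared NOT NULL.
ALTER TABLE log_empty_demo ALTER COLUMN made_by SET NOT NULL;
```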
&lt;br /&gt;
== * (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
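&lt;br /&gt;
A minimal sketch of the unknown-person approach, assuming a people table keyed on &amp;lt;code&amp;gt;person&amp;lt;/code&amp;gt; (the column used in the query for problem #23). The &amp;lt;code&amp;gt;active&amp;lt;/code&amp;gt; flag, the table names, and the sample row are hypothetical:&lt;br /&gt;
&lt;br /&gt;
```sql
-- Hypothetical stand-ins; the "active" column is an assumption.
CREATE TEMPORARY TABLE people_demo (
  person TEXT PRIMARY KEY,
  active BOOLEAN NOT NULL
);
CREATE TEMPORARY TABLE log_unk_demo (
  chimp_id TEXT,
  made_by  TEXT NOT NULL REFERENCES people_demo (person)
);

-- The unknown person exists but is inactive, so new data entry can
-- be made to refuse it while converted rows still reference it.
INSERT INTO people_demo VALUES ('UNK', FALSE);

-- During conversion, a NULL made_by becomes 'UNK'.
INSERT INTO log_unk_demo (chimp_id, made_by)
  SELECT src.chimp_id, COALESCE(src.made_by, 'UNK')
    FROM (VALUES ('AAA', NULL::TEXT)) AS src (chimp_id, made_by);
```
&lt;br /&gt;
The foreign key keeps every converted row pointing at a real people row, which is what makes the NOT NULL rule enforceable.&lt;br /&gt;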
&lt;br /&gt;
&lt;br /&gt;
== * (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#21) There are 6 rows in BIOGRAPHY_LOG where the Description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=324</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=324"/>
		<updated>2024-02-01T20:02:37Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* Solution */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which help eliminate spurious problems, and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text values in this column, never &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== * (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; the communities they name do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Add KL and KL_KK to CommIds.  KL = Kalande, KL_KK = Kasekela/Kalande&lt;br /&gt;
&lt;br /&gt;
== * (#13) 234 COMM_MEMBS rows place individuals in a community before their BIOGRAPHY.StartDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
234 COMM_MEMBS rows place individuals into a community before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_entrydate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) 6 COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
6 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_departdate, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access, 2/1/2024, ICG.&lt;br /&gt;
Changed the end date of the KK_P1 membership to 8/14/2022.&lt;br /&gt;
&lt;br /&gt;
== * (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Have the conversion program change all the NULL values to &amp;lt;nowiki&amp;gt;&amp;#039;&amp;#039;&amp;lt;/nowiki&amp;gt;, the empty string.&lt;br /&gt;
&lt;br /&gt;
== * (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18.)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#21) There are 6 rows in BIOGRAPHY_LOG where the Description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single person code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=323</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=323"/>
		<updated>2024-02-01T20:01:46Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* Solution */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how each issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text values in this column.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== * (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; the communities they reference do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Add KL and KL_KK to COMM_IDS.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
&lt;br /&gt;
== * (#13) 234 COMM_MEMBS rows place individuals in a community before their BIOGRAPHY.StartDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
234 COMM_MEMBS rows place individuals into a community before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_entrydate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) 6 COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
6 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_departdate, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Fixed in MS Access 2/1/2024 ICG&lt;br /&gt;
&lt;br /&gt;
== * (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Have the conversion program change all the NULL values to &amp;lt;nowiki&amp;gt;&amp;#039;&amp;#039;&amp;lt;/nowiki&amp;gt;, the empty string.&lt;br /&gt;
&lt;br /&gt;
== * (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18.)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#21) There are 6 rows in BIOGRAPHY_LOG where the Description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=322</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=322"/>
		<updated>2024-02-01T19:53:15Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* Solution */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
==  (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text, never &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column has non-AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
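&lt;br /&gt;
A minimal sketch of such a view, offered only as an illustration: the lower-case table and column names (&amp;lt;code&amp;gt;biography_data&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;b_dadid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;b_dadidprelim&amp;lt;/code&amp;gt;) and the choice to prefer the confirmed father over the preliminary one are assumptions, and the real view would also carry the remaining BIOGRAPHY columns.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical names; not the actual SokweDB definition.&lt;br /&gt;
create view biography as&lt;br /&gt;
  select b_animid&lt;br /&gt;
       -- Use the confirmed father when present; otherwise show the&lt;br /&gt;
       -- preliminary father with the &amp;#039;_prelim&amp;#039; suffix.&lt;br /&gt;
       -- The result is NULL when both columns are NULL.&lt;br /&gt;
       , coalesce(b_dadid, b_dadidprelim || &amp;#039;_prelim&amp;#039;) as b_dadid&lt;br /&gt;
    from biography_data;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;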
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== * (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; the communities they name do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Add KL and KL_KK to COMM_IDS.  KL = Kalande, KL_KK = Kasekela/Kalande.&lt;br /&gt;
&lt;br /&gt;
== * (#13) 234 COMM_MEMBS rows place individuals in a community before their BIOGRAPHY.StartDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
234 COMM_MEMBS rows place individuals into a community before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_entrydate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) 6 COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
6 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_departdate, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Have the conversion program change all the NULL values to &amp;lt;nowiki&amp;gt;&amp;#039;&amp;#039;&amp;lt;/nowiki&amp;gt;, the empty string.&lt;br /&gt;
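&lt;br /&gt;
As a sketch of the intended result, the same change expressed directly in SQL against the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema might look like the following.  This assumes &amp;lt;code&amp;gt;made_by&amp;lt;/code&amp;gt; is a text column; the conversion program itself may do this differently.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only: replace NULL MadeBy values with the empty string.&lt;br /&gt;
update clean.community_membership_update_log&lt;br /&gt;
  set made_by = &amp;#039;&amp;#039;&lt;br /&gt;
  where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;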
&lt;br /&gt;
== * (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
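&lt;br /&gt;
A rough sketch of this approach in SQL, assuming &amp;lt;code&amp;gt;clean.people&amp;lt;/code&amp;gt; can accept a row with only the &amp;lt;code&amp;gt;person&amp;lt;/code&amp;gt; code filled in (any other required columns, including however &amp;quot;inactive&amp;quot; is recorded, are not shown):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Illustration only; UNK is the unknown person described above.&lt;br /&gt;
insert into clean.people (person) values (&amp;#039;UNK&amp;#039;);&lt;br /&gt;
&lt;br /&gt;
-- Point every log row that has no MadeBy at the unknown person.&lt;br /&gt;
update clean.community_membership_update_log&lt;br /&gt;
  set made_by = &amp;#039;UNK&amp;#039;&lt;br /&gt;
  where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;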
&lt;br /&gt;
&lt;br /&gt;
== * (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18.)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#21) There are 6 rows in BIOGRAPHY_LOG where the Description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
	<entry>
		<id>https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=321</id>
		<title>Conversion Data Issues</title>
		<link rel="alternate" type="text/html" href="https://sokwe.janegoodall.org/w/index.php?title=Conversion_Data_Issues&amp;diff=321"/>
		<updated>2024-02-01T19:43:38Z</updated>

		<summary type="html">&lt;p&gt;IanGilby: /* * (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- Using sections, because avoiding any empty lines while using numbered lists is painful.  --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Numbering, so that we can refer to problems by number.&lt;br /&gt;
     Please don&amp;#039;t change the numbering. --&amp;gt;&lt;br /&gt;
This page lists all the problems with the data that were encountered during the data conversion process, and how the issue was resolved.&lt;br /&gt;
&lt;br /&gt;
The problems are numbered, in the order in which they were encountered during the conversion.&lt;br /&gt;
&lt;br /&gt;
Given a choice, earlier problems should be solved before later problems.&lt;br /&gt;
This allows the later steps in the conversion to receive &amp;quot;correct&amp;quot; data, which helps eliminate spurious problems and aids the discovery of problems hidden by bad data.&lt;br /&gt;
&lt;br /&gt;
Unsolved problems are marked with an &amp;lt;code&amp;gt;*&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== (#1) FOLLOW_MAP_TIME duplicate keys ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The data dump says that the &amp;lt;code&amp;gt;FOLLOW_MAP_TIME&amp;lt;/code&amp;gt; table has a primary key consisting of, in order, the columns: &amp;lt;code&amp;gt;FMT_FOL_date&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_FOL_B_focal_AnimID&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;FMT_time&amp;lt;/code&amp;gt;.&lt;br /&gt;
But these columns contain duplicate values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
The duplicate values can be listed (from the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema) with:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;SELECT *&lt;br /&gt;
  FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
    JOIN (SELECT &amp;quot;FMT_FOL_date&amp;quot; AS the_date&lt;br /&gt;
               , &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; AS the_animid&lt;br /&gt;
               , &amp;quot;FMT_time&amp;quot; AS the_time&lt;br /&gt;
            FROM &amp;quot;FOLLOW_MAP_TIME&amp;quot;&lt;br /&gt;
            GROUP BY &amp;quot;FMT_FOL_date&amp;quot;, &amp;quot;FMT_FOL_B_focal_AnimID&amp;quot;, &amp;quot;FMT_time&amp;quot;&lt;br /&gt;
            HAVING count(*) &amp;gt; 1&lt;br /&gt;
         ) AS fmt&lt;br /&gt;
      ON (&amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_date&amp;quot; = fmt.the_date&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_FOL_B_focal_AnimID&amp;quot; = fmt.the_animid&lt;br /&gt;
          AND &amp;quot;FOLLOW_MAP_TIME&amp;quot;.&amp;quot;FMT_time&amp;quot; = fmt.the_time);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access 11/17/23 by ICG. Checked all against Tikis.&lt;br /&gt;
&lt;br /&gt;
== (#2) SUBADULT_ARRIVALS_LOG has a textual SA_first_tiki_date column ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The SUBADULT_ARRIVALS_LOG table has a column that is supposed to contain a date, but instead contains the string &amp;quot;TEXTY&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Likely, the entire row is bad.  The row contains:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
       SA_B_AnimID | SA_first_tiki_date | SA_notes&lt;br /&gt;
       -------------+--------------------+----------&lt;br /&gt;
	TEXTY       | TEXTY              | TEXTY&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from  &amp;quot;SUBADULT_ARRIVALS_LOG&amp;quot; where &amp;quot;SA_first_tiki_date&amp;quot; = &amp;#039;TEXTY&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG.&lt;br /&gt;
Deleted row&lt;br /&gt;
&lt;br /&gt;
== (#3) BRECORD_NOTES contains rows where BREC_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 4 rows in BRECORD_NOTES where the BREC_FOL_date column, supposedly a date, contains time values that are not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_FOL_date&amp;quot;::TIME &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#4) MATING_EVENT contains rows where M_FOL_date has values that are not just a date ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The MATING_EVENT table contains 1 row where the time portion of M_FOL_date is not &amp;#039;00:00:00&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from &amp;quot;MATING_EVENT&amp;quot; where &amp;quot;M_FOL_date&amp;quot;::TIME WITHOUT TIME ZONE &amp;lt;&amp;gt; &amp;#039;00:00:00&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Fixed in MS Access, 11/16/2023, ICG&lt;br /&gt;
Deleted time stamp&lt;br /&gt;
&lt;br /&gt;
== (#5) BIOGRAPHY.DepartdateError data discarded ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
The BIOGRAPHY.DepartdateError column contains data that was, after discussion, determined to be unusable.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_DepartdateError&amp;quot; &amp;lt;&amp;gt; 0;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
Do not convert the data.  There is no corresponding column in the new db design.&lt;br /&gt;
&lt;br /&gt;
== (#6) BIOGRAPHY.B_AnimID_num column contains the empty string ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column contains the empty string, instead of NULL.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BIOGRAPHY&amp;quot; where &amp;quot;B_AnimID_num&amp;quot; = &amp;#039;&amp;#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed 11 empty string values to NULL in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#7) BIOGRAPHY.B_AnimID_num column is textual ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.B_AnimID_num column has a data type of TEXT.&lt;br /&gt;
The data values all begin with &amp;quot;CH&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_AnimID_num&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_AnimID_num&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        AND &amp;quot;B_AnimID_num&amp;quot; IS DISTINCT FROM NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Removed the &amp;quot;CH&amp;quot; prefix and made the column an integer in MS Access. ICG 12/6/2023&lt;br /&gt;
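&lt;br /&gt;
As a rough sketch only (the actual fix was made in MS Access), the same transformation could be expressed in PostgreSQL against the earlier textual data as follows. The &amp;lt;code&amp;gt;animid_num&amp;lt;/code&amp;gt; output name is hypothetical, and the cast assumes that everything after the prefix is numeric.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Strip a leading &amp;quot;CH&amp;quot; and cast what remains to an integer.&lt;br /&gt;
-- nullif() turns an empty string into NULL so the cast does not fail on it.&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;&lt;br /&gt;
     , nullif(regexp_replace(&amp;quot;B_AnimID_num&amp;quot;, &amp;#039;^CH&amp;#039;, &amp;#039;&amp;#039;), &amp;#039;&amp;#039;)::integer as animid_num&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;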
&lt;br /&gt;
== (#8) BIOGRAPHY.DadID_publication_info column contains NULL values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID_publication_info column has a data type that allows &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; values.&lt;br /&gt;
SokweDB wants only text values in this column.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;This query no longer reports results because the data was changed in the original MS Access data.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; IS NULL&lt;br /&gt;
  order by &amp;quot;B_AnimID&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Changed the &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; to the empty string (&amp;quot;&amp;quot;) in MS Access. ICG 12/6/2023&lt;br /&gt;
&lt;br /&gt;
== (#9) BIOGRAPHY.DadID column contains non-AnimID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
The BIOGRAPHY.DadID column contains values that are not AnimID values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select &amp;quot;B_AnimID&amp;quot;, &amp;quot;B_DadID&amp;quot;&lt;br /&gt;
  from raw.&amp;quot;BIOGRAPHY&amp;quot;&lt;br /&gt;
  where &amp;quot;B_DadID&amp;quot; is not null&lt;br /&gt;
        and &amp;quot;B_DadID&amp;quot; &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from raw.&amp;quot;BIOGRAPHY&amp;quot; as search&lt;br /&gt;
                          where search.&amp;quot;B_AnimID&amp;quot; = raw.&amp;quot;BIOGRAPHY&amp;quot;.&amp;quot;B_DadID&amp;quot;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create the DadIDPrelim column in a BIOGRAPHY_DATA table, and make a BIOGRAPHY view that combines Dad_ID and DadIDPrelim into DadID -- adding the &amp;#039;_prelim&amp;#039; suffix as expected.&lt;br /&gt;
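&lt;br /&gt;
A minimal sketch of such a view, under stated assumptions: the column names (&amp;lt;code&amp;gt;animid&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;dad_id&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;dadidprelim&amp;lt;/code&amp;gt;) are hypothetical, a confirmed Dad_ID is assumed to take precedence over a preliminary one, and all other BIOGRAPHY columns are left out.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Hypothetical column names; the real view would carry every BIOGRAPHY column.&lt;br /&gt;
create view biography as&lt;br /&gt;
  select animid&lt;br /&gt;
       , case&lt;br /&gt;
           when dad_id is not null then dad_id&lt;br /&gt;
           when dadidprelim is not null then dadidprelim || &amp;#039;_prelim&amp;#039;&lt;br /&gt;
         end as dadid  -- NULL when neither column has a value&lt;br /&gt;
    from biography_data;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;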
&lt;br /&gt;
== (#10) BRECORD_NOTES contains rows where BREC_time has values that are not just a time ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 56 rows in BRECORD_NOTES where the BREC_time column, supposedly a time, contains date values that are not &amp;#039;1899-12-30&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;raw&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from raw.&amp;quot;BRECORD_NOTES&amp;quot; where &amp;quot;BREC_time&amp;quot;::DATE &amp;lt;&amp;gt; &amp;#039;1899-12-30&amp;#039;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
The data is fixed in the conversion process.&lt;br /&gt;
&lt;br /&gt;
== * (#11) KAZ has b_dadid_publication_info, but b_dadid is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;KAZ&amp;lt;/code&amp;gt; has dad id publication info of &amp;#039;Rudicell et al. 2010&amp;#039;, but a &amp;lt;code&amp;gt;NULL&amp;lt;/code&amp;gt; dadid.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * &lt;br /&gt;
  from clean.biography&lt;br /&gt;
  where b_dadid is NULL&lt;br /&gt;
        and (b_dadid_publication_info &amp;lt;&amp;gt; &amp;#039;&amp;#039;&lt;br /&gt;
             or b_dadid_publication_info is null);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#12) 9 BIOGRAPHY rows have BirthComm values that are not COMM_IDS.CommID values ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 9 BIOGRAPHY rows with BirthComm values that are not valid COMM_IDS.CommID values; the communities they name do not exist.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from easy.biography&lt;br /&gt;
  where b_birthgroup is not NULL&lt;br /&gt;
        and not exists (select 1&lt;br /&gt;
                          from easy.community_lookup&lt;br /&gt;
                          where community_lookup.cl_community_id&lt;br /&gt;
                                = biography.b_birthgroup);&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#13) 234 COMM_MEMBS rows place individuals in a community before their BIOGRAPHY.StartDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
234 COMM_MEMBS rows place individuals into a community before BIOGRAPHY says they&lt;br /&gt;
entered the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_entrydate, cm.cm_start_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_start_date &amp;lt; b.b_entrydate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#14) 6 COMM_MEMBS rows place individuals in a community after their BIOGRAPHY.EndDate ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
6 COMM_MEMBS rows place individuals into a community after BIOGRAPHY says they&lt;br /&gt;
left the community.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select b.b_animid, b.b_departdate, cm.cm_end_date&lt;br /&gt;
  from clean.community_membership as cm&lt;br /&gt;
    join clean.biography as b&lt;br /&gt;
         on (b.b_animid = cm.cm_b_animid)&lt;br /&gt;
  where cm.cm_end_date &amp;gt; b.b_departdate;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#15) TT is placed in a community twice on the same day ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
An individual may not be in more than one community (or even twice in the same community) on any given day.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select first.cm_b_animid as anim_id&lt;br /&gt;
     , first.cm_start_date as first_start_date&lt;br /&gt;
     , first.cm_end_date as first_end_date&lt;br /&gt;
     , first.cm_cl_community_id as first_community_id&lt;br /&gt;
     , first.cm_start_source as first_start_source&lt;br /&gt;
     , first.cm_end_source as first_end_source&lt;br /&gt;
     , second.cm_start_date as second_start_date&lt;br /&gt;
     , second.cm_end_date as second_end_date&lt;br /&gt;
     , second.cm_cl_community_id as second_community_id&lt;br /&gt;
     , second.cm_start_source as second_start_source&lt;br /&gt;
     , second.cm_end_source as second_end_source&lt;br /&gt;
  from clean.community_membership as first&lt;br /&gt;
    join clean.community_membership as second&lt;br /&gt;
         on (first.cm_b_animid = second.cm_b_animid&lt;br /&gt;
             and first.cm_start_date &amp;lt; second.cm_start_date)&lt;br /&gt;
  where first.cm_end_date &amp;gt;= second.cm_start_date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#16) There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow NULL values.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Have the conversion program change all the NULL values to &amp;lt;nowiki&amp;gt;&amp;#039;&amp;#039;&amp;lt;/nowiki&amp;gt;, the empty string.&lt;br /&gt;
&lt;br /&gt;
== * (#17) There are 64 rows in COMM_MEMB_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 64 rows in COMM_MEMB_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select date_of_update, chimp_id&lt;br /&gt;
  from clean.community_membership_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
== * (#18) There are 31 rows in COMM_MEMB_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 31 rows in COMM_MEMB_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.community_membership_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
&lt;br /&gt;
This allows us to make the unknown person &amp;quot;inactive&amp;quot;, preventing them from being used&lt;br /&gt;
in newly entered data.  The alternative, allowing NULL MadeBy values, would allow&lt;br /&gt;
new &amp;quot;bad data&amp;quot;, and NULL values make querying harder.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#19) There are 18 rows in BIOGRAPHY_LOG where the MadeBy is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 18 rows in BIOGRAPHY_LOG where MadeBy is NULL, and the column does not allow&lt;br /&gt;
NULLs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where made_by is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
Create an unknown person (UNK), and when the person is NULL, use the unknown person.&lt;br /&gt;
(See Problem #18)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#20) There are 3 rows in BIOGRAPHY_LOG where the Rationale is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 3 rows in BIOGRAPHY_LOG where Rationale is NULL.  The column does not allow&lt;br /&gt;
NULLs; normally the conversion program would convert NULL to the empty string.&lt;br /&gt;
But the Rationale column requires there be (non-empty) textual data.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_rationale is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#21) There are 6 rows in BIOGRAPHY_LOG where the Description is NULL ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 6 rows in BIOGRAPHY_LOG where Description is NULL.&lt;br /&gt;
It appears that the description was put into the Rationale column, sometimes along with some rationale.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select * from clean.biography_update_log where update_description is null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#22) There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not on BIOGRAPHY ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 85 rows in BIOGRAPHY_LOG where the chimp_id is not a BIOGRAPHY.AnimID.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.biography&lt;br /&gt;
                      where biography.b_animid = log.chimp_id);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== * (#23) There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE ==&lt;br /&gt;
&lt;br /&gt;
=== Problem ===&lt;br /&gt;
&lt;br /&gt;
There are 65 rows in BIOGRAPHY_LOG where the MadeBy is not on PEOPLE, but there is MadeBy data.&lt;br /&gt;
&lt;br /&gt;
These are &amp;quot;combination&amp;quot; ID errors, where multiple people are entered instead&lt;br /&gt;
of a single people code.&lt;br /&gt;
&lt;br /&gt;
=== Bad Data ===&lt;br /&gt;
&lt;br /&gt;
In the &amp;lt;code&amp;gt;clean&amp;lt;/code&amp;gt; schema run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
select *&lt;br /&gt;
  from clean.biography_update_log as log&lt;br /&gt;
  where not exists (select 1&lt;br /&gt;
                      from clean.people&lt;br /&gt;
                      where people.person = log.made_by)&lt;br /&gt;
        and made_by is not null;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Solution ===&lt;/div&gt;</summary>
		<author><name>IanGilby</name></author>
	</entry>
</feed>